Heregulin-like Factor



Field of the Invention 

The present invention relates to a novel human gene encoding a 
polypeptide which is a novel member of the heregulin family. More 
specifically, isolated nucleic acid molecules are provided encoding a human 
polypeptide named heregulin-like factor, hereinafter referred to as "HLF". HLF 
polypeptides are also provided, as are vectors, host cells and recombinant
methods for producing the same. Also provided are diagnostic methods for 
detecting disorders related to primary cancers, and therapeutic methods for 
treating such disorders. The invention further relates to screening methods for 
identifying agonists and antagonists of HLF activity. 

Background of the Invention 

The proto-oncogene termed erbB2 (or HER2) encodes a 185 kDa
transmembrane tyrosine kinase molecule designated p185erbB2. The
overexpression of this receptor molecule correlates strongly with a poor 
prognosis in a number of human cancers including, among others, breast, 
ovarian, endometrium, fallopian tube, cervix, and colon (Nowak, F., et al., 
Exp. Cell Res. 231:251-259; 1997; Cirisano, F. D. and Karlan, B. Y., J. Soc.
Gynecol. Investig. 3(3):99-105; 1996). Variously spliced transcripts of the
heregulin (HRG) gene have been found to indirectly stimulate p185erbB2
through transphosphorylation or receptor heterodimerization with erbB3 and
p180erbB4. A 45 kDa protein, designated HRG-α, specifically induces tyrosine
phosphorylation of p185erbB2 and has been purified from the conditioned
medium of a human breast tumor cell line (Holmes, W. E., et al., Science 
256:1205-1210; 1992). A second, related HRG molecule of 52 kDa, which 
may be the product of a novel gene, rather than a novel HRG gene splice 
product, has been identified which exhibits similar characteristics, including
induction of transient membrane ruffling, lamellipodia formation, cell motility
and proliferation of human breast cancer cells (Kung, W., et al., Biochem.
Biophys. Res. Commun. 202(3):1357-1365; 1994). In addition, more recent
studies have reported that heregulins can induce tyrosine phosphorylation not
only of p185erbB2, but of several additional EGFR-related family members
including erbB3 and p180erbB4 (Tzahar, E., et al., J. Biol. Chem.
269:25226-25233; 1994; Plowman, G. D., et al., Nature 366:473-475;
1993).

Lewis and colleagues (Cancer Res. 56:1457-1465; 1996) recently
performed an extensive analysis of the effects of the heregulin family of proteins
on a panel of breast and ovarian tumor cell lines. The biological responses to
HRG were also compared to EGF and to the growth-inhibitory anti-erbB2
antibody 4D5. In nearly all cases, HRG stimulation of DNA synthesis
correlated with positive effects on cell cycle progression and cell number and
with enhancement of colony formation in soft agar. In addition to the effects of
the heregulin family of proteins on breast and ovarian cells, similar effects have
also been recently observed on human Schwann cells (Levi, A. D., et al., J.
Neurosci. 15(2):1329-1340; 1995; Morrissey, T. K., et al., Proc. Natl. Acad.
Sci. USA 92(5):1431-1435; 1995), suggesting that the heregulin family of
proteins plays a key role in the genesis of a number of cancers.

The heregulin family of proteins consists at least of a number of splice
variants of heregulin, the Neu differentiation factor, the glial growth factors -I,
-II, and -III, and a protein that stimulates muscle acetylcholine receptor synthesis
(ARIA). In addition to the obvious role such polypeptides may play in
oncogenic events, these proteins have also been exploited as Pseudomonas
exotoxin A fusion proteins to inhibit the growth of several mammary carcinoma
cell lines as well as to cause growth retardation of transplanted human breast
tumor cells in mice (Jeschke, M., et al., Int. J. Cancer 60(5):730-739; 1995).

Thus, there is a need for polypeptides that function as regulators of 
oncogenic events and existing tumors. Therefore, there is a need for 
identification and characterization of such human polypeptides which can play a 
role in detecting, preventing, ameliorating or correcting such disorders. 

Summary of the Invention 

The present invention provides isolated nucleic acid molecules 
comprising a polynucleotide encoding at least a portion of the HLF polypeptide 
having the complete amino acid sequence shown in SEQ ID NO:2 or the 
complete amino acid sequence encoded by the cDNA clone deposited as
plasmid DNA with the American Type Culture Collection
("ATCC") on June 19, 1997, and assigned ATCC Deposit Number 209123.
The ATCC is located at 10801 University Boulevard, Manassas, Virginia
20110-2209. The nucleotide sequence determined by sequencing the deposited
HLF clone, which is shown in Figures 1A and 1B (SEQ ID NO:1), contains an
open reading frame encoding a complete polypeptide of 157 amino acid
residues, beginning in frame with a serine residue at the amino-terminal end of
the polypeptide corresponding to nucleotide positions 2-4, and a predicted
molecular weight of about 17.7 kDa. Nucleic acid molecules of the invention
include those encoding the complete amino acid sequence shown in SEQ ID 
NO:2, or the complete amino acid sequence encoded by the cDNA clone in 
ATCC Deposit Number 209123, which molecules also can encode additional 
amino acids fused to the N-terminus of the HLF amino acid sequence. 

The HLF protein of the present invention shares sequence homology 
with the translation product of the human mRNA for heregulin (Figure 2; SEQ 
ID NO:3), including the following conserved domains: (a) the predicted 
extracellular domain of about 101 amino acids; (b) the predicted transmembrane 
domain of about 19 amino acids, and (c) the predicted intracellular domain of 
about 35 amino acids. Heregulin is thought to be important in oncogenesis. 
The homology between heregulin and HLF indicates that HLF may also be 
involved in oncogenesis. 

Thus, one aspect of the invention provides an isolated nucleic acid 
molecule comprising a polynucleotide comprising a nucleotide sequence selected 
from the group consisting of: (a) a nucleotide sequence encoding the HLF 
polypeptide having the complete amino acid sequence in SEQ ID NO:2 (i.e., 
positions 1 to 157 of SEQ ID NO:2) or the complete amino acid sequence 
encoded by the cDNA clone contained in ATCC Deposit No. 209123; (b) a 
nucleotide sequence encoding the predicted extracellular domain of the HLF 
polypeptide having the amino acid sequence in SEQ ID NO:2 (i.e., positions 1 
to 101 of SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC 
Deposit No. 209123; (c) a nucleotide sequence encoding the predicted 
transmembrane domain of the HLF polypeptide having the amino acid sequence 
in SEQ ID NO:2 (i.e., positions 102 to 121 of SEQ ID NO:2) or as encoded by 
the cDNA clone contained in ATCC Deposit No. 209123; (d) a nucleotide 
sequence encoding the predicted intracellular domain of the HLF polypeptide 
having the amino acid sequence in SEQ ID NO:2 (i.e., positions 122 to 157 of 
SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit
No. 209123; (e) a nucleotide sequence encoding a soluble HLF polypeptide 
having the extracellular and intracellular domains but lacking the transmembrane 
domain; and (f) a nucleotide sequence complementary to any of the nucleotide 
sequences in (a) through (e) above. 
Further embodiments of the invention include isolated nucleic acid
molecules that comprise a polynucleotide having a nucleotide sequence at least 
90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99% 
identical to (or as stated in another way, a nucleotide sequence at most 10% 
different, and more preferably 5%, 4%, 3%, 2% or 1% different from), any of 
the nucleotide sequences in (a) through (f) above, or a polynucleotide which 
hybridizes under stringent hybridization conditions to a polynucleotide in (a) 
through (f) above. This polynucleotide which hybridizes does not hybridize
under stringent hybridization conditions to a polynucleotide having a nucleotide 
sequence consisting of only A residues or of only T residues. An additional 
nucleic acid embodiment of the invention relates to an isolated nucleic acid 
molecule comprising a polynucleotide which encodes the amino acid sequence 
of an epitope-bearing portion of a HLF polypeptide having an amino acid 
sequence in (a) through (e) above. 
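
The relationship between "percent identical" and "percent different" stated above can be illustrated with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length and that gap columns count as differences; it is a simplified column count, not the algorithm of any particular alignment program, and the function name and the two 20-nucleotide sequences are hypothetical placeholders rather than HLF sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must already be aligned to equal length; '-' marks a gap.
    A column in which either sequence has a gap counts as a difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must be non-empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 20-nt reference and a query carrying two substitutions.
reference = "ATGGCCTTGAACGGTCTGAA"
query = "ATGGCCTTGAACGGTCAGAT"

identity = percent_identity(reference, query)
print(f"{identity:.1f}% identical, {100 - identity:.1f}% different")
```

With 18 of 20 columns matching, the query in this sketch sits exactly at the 90% identity (10% difference) threshold.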

The present invention also relates to recombinant vectors, which include 
the isolated nucleic acid molecules of the present invention, and to host cells 
containing the recombinant vectors, as well as to methods of making such 
vectors and host cells and for using them for production of HLF polypeptides or 
peptides by recombinant techniques. 

The invention further provides an isolated HLF polypeptide comprising 
an amino acid sequence selected from the group consisting of: (a) the amino 
acid sequence of the HLF polypeptide having the complete amino acid sequence 
shown in SEQ ID NO:2 (i.e., positions 1 to 157 of SEQ ID NO:2) or the 
complete amino acid sequence encoded by the cDNA clone contained in the 
ATCC Deposit No. 209123; (b) the amino acid sequence of the predicted 
extracellular domain of the HLF polypeptide having the amino acid sequence 
shown in SEQ ID NO:2 (i.e., positions 1 to 101 of SEQ ID NO:2) or as 
encoded by the cDNA clone contained in the ATCC Deposit No. 209123; (c) 
the amino acid sequence of the predicted transmembrane domain of the HLF 
polypeptide having the amino acid sequence shown in SEQ ID NO:2 (i.e., 
positions 102 to 121 of SEQ ID NO:2) or as encoded by the cDNA clone 
contained in the ATCC Deposit No. 209123; (d) the amino acid sequence of the 
predicted intracellular domain of the HLF polypeptide having the amino acid 
sequence shown in SEQ ID NO:2 (i.e., positions 122 to 157 of SEQ ID NO:2) 
or as encoded by the cDNA clone contained in the ATCC Deposit No. 209123; 
and (e) the amino acid sequence of a soluble HLF polypeptide having the 
extracellular and intracellular domains but lacking the transmembrane domain. 
The polypeptides of the present invention also include polypeptides having an
amino acid sequence at least 80% identical (or at most 20% different), more 
preferably at least 90% identical (or at most 10% different), and still more 
preferably 95%, 96%, 97%, 98% or 99% identical to (or 5%, 4%, 3%, 2% or 
1% different from) those described in (a), (b), (c), (d), or (e) above, as well as 
polypeptides having an amino acid sequence with at least 90% similarity, and 
more preferably at least 95% similarity, to those above. 
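
The domain boundaries listed in (b) through (e) above are 1-based, inclusive positions, so they map onto sequence slices as sketched below. This is an illustration only: the 157-residue placeholder string stands in for SEQ ID NO:2, which is not reproduced here, and the names DOMAINS, domain and soluble_form are assumptions introduced for the example.

```python
# 1-based, inclusive residue ranges taken from the text above.
DOMAINS = {
    "extracellular": (1, 101),
    "transmembrane": (102, 121),
    "intracellular": (122, 157),
}


def domain(sequence: str, name: str) -> str:
    """Return the residues of the named domain from a full-length sequence."""
    start, end = DOMAINS[name]
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return sequence[start - 1:end]


def soluble_form(sequence: str) -> str:
    """Extracellular plus intracellular domains, transmembrane domain omitted."""
    return domain(sequence, "extracellular") + domain(sequence, "intracellular")


# Placeholder sequence (NOT SEQ ID NO:2): one letter per domain so that the
# slices are easy to check by eye.
placeholder = "E" * 101 + "T" * 20 + "I" * 36

for name in DOMAINS:
    print(name, len(domain(placeholder, name)))
print("soluble form", len(soluble_form(placeholder)))
```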

An additional embodiment of this aspect of the invention relates to a 
peptide or polypeptide which comprises the amino acid sequence of an 
epitope-bearing portion of a HLF polypeptide having an amino acid sequence
described in (a), (b), (c), (d), or (e) above. Peptides or polypeptides having the 
amino acid sequence of an epitope-bearing portion of a HLF polypeptide of the 
invention include portions of such polypeptides with at least six or seven, 
preferably at least nine, and more preferably at least about 30 amino acids to 
about 50 amino acids, although epitope-bearing polypeptides of any length up to 
and including the entire amino acid sequence of a polypeptide of the invention 
described above also are included in the invention. 

In another embodiment, the invention provides an isolated antibody that 
binds specifically to a HLF polypeptide having an amino acid sequence 
described in (a), (b), (c), (d), or (e) above. The invention further provides
methods for isolating antibodies that bind specifically to a HLF polypeptide
having an amino acid sequence as described herein. Such antibodies are useful
diagnostically or therapeutically as described below. 

The invention also provides for pharmaceutical compositions comprising 
HLF polypeptides, particularly human HLF polypeptides, which may be 
employed, for instance, to treat many types of cancer. Methods of treating 
individuals in need of HLF polypeptides are also provided. 

The invention further provides compositions comprising a HLF 
polynucleotide or an HLF polypeptide for administration to cells in vitro, to
cells ex vivo and to cells in vivo, or to a multicellular organism. In certain 
particularly preferred embodiments of this aspect of the invention, the 
compositions comprise a HLF polynucleotide for expression of a HLF 
polypeptide in a host organism for treatment of disease. Particularly preferred 
in this regard is expression in a human patient for treatment of a dysfunction 
associated with aberrant endogenous activity of a HLF 

The present invention also provides a screening method for identifying 
compounds capable of enhancing or inhibiting a biological activity of the HLF 
polypeptide, which involves contacting a receptor which is inhibited or
enhanced by the HLF polypeptide with the candidate compound in the presence 
of an HLF polypeptide, assaying changes in tyrosine phosphorylation states of 
the receptor and/or other molecules downstream in the corresponding signal 
transduction cascade in the presence of the candidate compound and of HLF 
polypeptide, and comparing the receptor activation state to a standard level, the 
standard being assayed when contact is made with the receptor in the
presence of the HLF polypeptide and the absence of the candidate compound. In
this assay, an increase in receptor activation state over the standard indicates that 
the candidate compound is an agonist of HLF activity and a decrease in receptor 
activation state compared to the standard indicates that the compound is an 
antagonist of HLF activity. 
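
The comparison against the standard in this assay can be sketched as follows. This is a minimal illustration, assuming the receptor activation state has been reduced to a single number (for instance, a phosphotyrosine band intensity); the function name, the tolerance parameter and the example values are hypothetical and are not part of the assay as described.

```python
def classify_candidate(with_candidate: float, standard: float,
                       tolerance: float = 0.0) -> str:
    """Classify a candidate compound by receptor activation relative to the standard.

    with_candidate: activation measured with HLF polypeptide and the candidate.
    standard:       activation measured with HLF polypeptide alone.
    tolerance:      assumed measurement noise; differences within it are ignored.
    """
    if with_candidate > standard + tolerance:
        return "agonist"
    if with_candidate < standard - tolerance:
        return "antagonist"
    return "no effect detected"


# Hypothetical readings in arbitrary units.
print(classify_candidate(1.8, 1.0, tolerance=0.1))
print(classify_candidate(0.4, 1.0, tolerance=0.1))
print(classify_candidate(1.05, 1.0, tolerance=0.1))
```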

In another aspect, a screening assay for agonists and antagonists is 
provided which involves determining the effect a candidate compound has on 
HLF binding to a receptor. In particular, the method involves contacting the 
receptor with an HLF polypeptide and a candidate compound and determining 
whether HLF polypeptide binding to the receptor is increased or decreased due 
to the presence of the candidate compound. In this assay, an increase in binding 
of HLF over the standard binding indicates that the candidate compound is an 
agonist of HLF binding activity and a decrease in HLF binding compared to the 
standard indicates that the compound is an antagonist of HLF binding activity. 

It has been discovered that HLF is expressed only in the amygdala, 
whole brain, and primary breast culture tissue. Therefore, nucleic acids of the 
invention are useful as hybridization probes for differential identification of the 
tissue(s) or cell type(s) present in a biological sample. Similarly, polypeptides 
and antibodies directed to those polypeptides are useful to provide 
immunological probes for differential identification of the tissue(s) or cell 
type(s). In addition, for a number of disorders of the above tissues or cells, 
particularly of the neural system, significantly higher or lower levels of HLF 
gene expression may be detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or 
spinal fluid) taken from an individual having such a disorder, relative to a 
"standard" HLF gene expression level, i.e., the HLF expression level in healthy
tissue from an individual not having the neural system disorder. Thus, the 
invention provides a diagnostic method useful during diagnosis of such a 
disorder, which involves: (a) assaying HLF gene expression level in cells or 
body fluid of an individual; (b) comparing the HLF gene expression level with a 
standard HLF gene expression level, whereby an increase or decrease in the
assayed HLF gene expression level compared to the standard expression level is 
indicative of disorder in the neural system. 

An additional aspect of the invention is related to a method for treating 
an individual in need of an increased level of HLF activity in the body 
comprising administering to such an individual a composition comprising a 
therapeutically effective amount of an isolated HLF polypeptide of the invention 
or an agonist thereof. 

A still further aspect of the invention is related to a method for treating 
an individual in need of a decreased level of HLF activity in the body 
comprising administering to such an individual a composition comprising a
therapeutically effective amount of an HLF antagonist. Preferred antagonists 
for use in the present invention are HLF-specific antibodies. 

Brief Description of the Figures 

Figures 1A and 1B show the nucleotide sequence (SEQ ID NO:1) and
deduced amino acid sequence (SEQ ID NO:2) of HLF. An extracellular
epidermal growth factor (EGF) domain, conserved in many other EGF-like
polypeptides, is underlined in Figures 1A and 1B.

Figure 2 shows the regions of identity between the amino acid 
sequences of the HLF protein and translation product of the human mRNA for 
heregulin (SEQ ID NO:3), determined by the computer program Bestfit 
(Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics 
Computer Group, University Research Park, 575 Science Drive, Madison, WI
53711) using the default parameters.

Figure 3 shows an analysis of the HLF amino acid sequence. Alpha, 
beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic 
regions; flexible regions; antigenic index and surface probability are shown. In
the "Antigenic Index - Jameson-Wolf" graph, the positive peaks indicate
locations of the highly antigenic regions of the HLF protein, i.e., regions from 
which epitope-bearing peptides of the invention can be obtained. 

Figure 4 shows a demonstration of the biochemical activity of a 
recombinant EGF domain of the HLF protein (designated "H" in the figure; as 
described in Example 5). The figure shows a Western blot of MCF-7 cell 
lysates prepared from cultures which were treated or mock-treated with
recombinant EGF domain of the HLF protein or with recombinant heregulin.
The blots were immunoblotted with an anti-phosphotyrosine monoclonal
antibody. 

Figure 5 shows the amino acid sequences of the EGF/heregulin family
of growth factors and of the NRG-3 novel sequences. Cysteines (C) defining
the basic structure of the EGF domain and highly conserved amino acids are in
bold. Listed are sequences for the EGF-like domains of transforming growth
factor (TGF)-α (SEQ ID NO:11); epidermal growth factor (EGF; SEQ ID
NO:12); heparin-binding EGF (HB-EGF; SEQ ID NO:13); amphiregulin
(Amph; SEQ ID NO:14); β-cellulin (β-Cell; SEQ ID NO:15); neuregulin (neuR;
SEQ ID NO:16); human heregulins 1-α (HRGα1; SEQ ID NO:17) and 1-β
(HRGβ1; SEQ ID NO:18); heregulin-related gene (HRG)-2 (SEQ ID NO:19);
and amino acids 29-80 of HLF of the present invention (amino acids 29-80 of
SEQ ID NO:2).

Detailed Description 

The present invention provides isolated nucleic acid molecules
comprising a polynucleotide encoding a HLF polypeptide having the amino acid
sequence shown in SEQ ID NO:2, which was determined by sequencing a
cloned cDNA. The nucleotide sequence shown in Figures 1A and 1B (SEQ ID
NO:1) was obtained by sequencing the HAGFE38 clone, which was deposited
on June 19, 1997 at the American Type Culture Collection, 12301 Park Lawn
Drive, Rockville, Maryland 20852 (the ATCC is now located at 10801
University Blvd., Manassas, VA 20110-2209), and given accession number
ATCC 209123. The deposited clone is contained in the pBluescript SK(-)
plasmid (Stratagene, La Jolla, CA).

The HLF protein of the present invention shares sequence homology
with the translation product of the human mRNA for heregulin (Figure 2; SEQ
ID NO:3). Heregulin is thought to be an important molecule in the activation
pathways of the erbB family of cell surface receptors. Altered expression of
heregulin and related ligand molecules, and/or the erbB family of receptor
molecules can often lead to the loss of regulation of cellular growth and
ultimately to oncogenesis. For example, the neu differentiation factor (NDF) is
a homologue of both heregulin and HLF. NDF is a neuron/glia-specific
signaling molecule which has been observed to regulate survival, proliferation,
and maturation of Schwann cell precursors (Dong, Z., et al., Neuron
15:585-596; 1995; Marchionni, M. A., et al., Nature 362:312-318; 1993).
HLF is a member of the same heregulin family of proteins and has, at least,
activities similar to those described above for heregulin and NDF.

Nucleic Acid Molecules 

Unless otherwise indicated, all nucleotide sequences determined by 
sequencing a DNA molecule herein were determined using an automated DNA 
sequencer (such as the Model 373 from Applied Biosystems, Inc., Foster City, 
CA), and all amino acid sequences of polypeptides encoded by DNA molecules
determined herein were predicted by translation of a DNA sequence determined 
as above. Therefore, as is known in the art for any DNA sequence determined 
by this automated approach, any nucleotide sequence determined herein may 
contain some errors. Nucleotide sequences determined by automation are 
typically at least about 90% identical (or 10% different), more typically at least 
about 95% to at least about 99.9% identical to (or at most about 5% to at most 
about 0.1% different from) the actual nucleotide sequence of the sequenced 
DNA molecule. The actual sequence can be more precisely determined by other 
approaches including manual DNA sequencing methods well known in the art. 
As is also known in the art, a single insertion or deletion in a determined 
nucleotide sequence compared to the actual sequence will cause a frame shift in 
translation of the nucleotide sequence such that the predicted amino acid 
sequence encoded by a determined nucleotide sequence will be completely 
different from the amino acid sequence actually encoded by the sequenced DNA 
molecule, beginning at the point of such an insertion or deletion. 
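
The frame shift effect described above can be demonstrated with a short translation routine. The sketch below uses the standard genetic code; the two short DNA strings are hypothetical and are not taken from SEQ ID NO:1. The "determined" sequence differs from the "actual" one by a single inserted base, and the predicted amino acid sequences diverge from that point onward.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences: 'determined' has one extra C after the ninth base.
actual = "ATGGCTAAAGGTCTGTTCGAA"
determined = actual[:9] + "C" + actual[9:]

print(translate(actual))      # residues encoded by the actual sequence
print(translate(determined))  # agrees for three residues, then diverges
```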

By "nucleotide sequence" of a nucleic acid molecule or polynucleotide is 
intended, for a DNA molecule or polynucleotide, a sequence of 
deoxyribonucleotides, and for an RNA molecule or polynucleotide, the 
corresponding sequence of ribonucleotides (A, G, C and U), where each 
thymidine deoxyribonucleotide (T) in the specified deoxyribonucleotide 
sequence is replaced by the ribonucleotide uridine (U). 
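
The correspondence between a DNA sequence and its RNA counterpart defined above amounts to a single substitution. A minimal sketch, in which the function name and the example sequence are hypothetical:

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence corresponding to a DNA sequence (each T becomes U)."""
    return dna.upper().replace("T", "U")


print(dna_to_rna("ATGTCTGGA"))
```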

Using the information provided herein, such as the nucleotide sequence 
in Figures 1A and 1B (SEQ ID NO:1), a nucleic acid molecule of the present
invention encoding a HLF polypeptide may be obtained using standard cloning
and screening procedures, such as those for cloning cDNAs using mRNA as
starting material. Illustrative of the invention, the nucleic acid molecule
described in Figures 1A and 1B (SEQ ID NO:1) was discovered in a cDNA
library derived from human amygdala. 
The determined nucleotide sequence of the HLF cDNA of Figures 1A
and 1B (SEQ ID NO:1) contains an open reading frame encoding a protein of
157 amino acid residues, with an amino-terminal serine codon at nucleotide
positions 2-4 of the nucleotide sequence in Figure 1A (SEQ ID NO:1), and a
deduced molecular weight of about 17.7 kDa. The amino acid sequence of the
HLF protein shown in SEQ ID NO:2 is about 32.7% identical to the translation
product of the human mRNA for heregulin (Figure 2). The nucleotide and amino
acid sequences of human heregulin have been reported by Holmes and colleagues
(Science 256:1205-1210; 1992; GenBank Accession No. M94166).

The open reading frame of the HLF gene shares sequence homology 
with the translation product of the human mRNA for heregulin (Figure 2; SEQ 
ID NO:3), including the conserved EGF domain in HLF of about 67 amino 
acids (amino acids 26-93 of SEQ ID NO:2). Heregulin is thought to be 
important in the regulation of the activation state of the erbB family of cell 
surface receptors, in the regulation of cellular growth control, and ultimately in 
the regulation of oncogenesis. The homology between heregulin and HLF 
indicates that HLF may also be involved in the regulation of the activation state 
of the erbB family of cell surface receptors, in the regulation of cellular growth 
control, and ultimately in the regulation of oncogenesis. 

As one of ordinary skill would appreciate, due to the possibilities of 
sequencing errors discussed above, the actual complete HLF polypeptide 
encoded by the deposited cDNA, which comprises about 157 amino acids, may 
be somewhat longer or shorter. More generally, the actual open reading frame 
may be anywhere in the range of ±100 amino acids, ±20 amino acids, more 
likely in the range of ±10 amino acids, of that predicted from the serine codon at 
the N-terminus shown in Figure 1A (SEQ ID NO:1). It will further be
appreciated that, depending on the analytical criteria used for identifying various 
functional domains, the exact "address" of the extracellular, EGF, 
transmembrane, and intracellular domains of the HLF polypeptide may differ 
slightly from the predicted positions above. For example, the exact location of 
the HLF EGF domain in SEQ ID NO:2 may vary slightly (e.g., the address may 
"shift" by about 1 to about 20 residues, more likely about 1 to about 5 residues) 
depending on the criteria used to define the domain. In this case, the ends of the 
transmembrane domain and the beginning of the EGF domain were predicted on 
the basis of the identification of the conserved cysteine residues at positions 35, 
43, 49, 62, 64, and 73 of SEQ ID NO:2, as shown in Figure 1A. In any event,
as discussed further below, the invention further provides polypeptides having
various residues deleted from the N-terminus of the complete polypeptide, 
including polypeptides lacking one or more amino acids from the N-terminus of 
the extracellular EGF domain described herein, which constitute soluble forms 
of the extracellular EGF domain of the HLF protein. 

Leader and Mature Sequences 

In addition, methods for predicting whether a protein has a secretory 
leader as well as the cleavage point for that leader sequence are available. For 
instance, the method of McGeoch (Virus Res. 3:271-286; 1985) uses the 
information from a short N-terminal charged region and a subsequent uncharged 
region of the complete (uncleaved) protein. The method of von Heijne (Nucleic
Acids Res. 14:4683-4690; 1986) uses the information from the residues
surrounding the cleavage site, typically residues -13 to +2 where +1 indicates
the amino terminus of the mature protein. The accuracy of predicting the
cleavage points of known mammalian secretory proteins for each of these
methods is in the range of 75-80% (von Heijne, supra). However, the two
methods do not always produce the same predicted cleavage point(s) for a given 
protein. 

In the present case, the deduced amino acid sequence of the complete 
HLF polypeptide was analyzed by the computer program PSORT, available 
from Dr. Kenta Nakai of the Institute for Chemical Research, Kyoto University 
(Nakai, K. and Kanehisa, M., Genomics 14:897-911; 1992), which is an
expert system for predicting the cellular location of a protein based on the amino
acid sequence. As part of this computational prediction of localization, the
methods of McGeoch and von Heijne are incorporated. The analysis of the
HLF amino acid sequence by this program indicated that there appears to be no
N-terminal signal sequence associated with the HLF amino acid sequence
shown in SEQ ID NO:2, and that the HLF molecule, as shown in SEQ ID
NO:2, appears to be a type Ib membrane protein.

As indicated, nucleic acid molecules of the present invention may be in 
the form of RNA, such as mRNA, or in the form of DNA, including, for
instance, cDNA and genomic DNA obtained by cloning or produced 
synthetically. The DNA may be double-stranded or single-stranded. 
Single-stranded DNA or RNA may be the coding strand, also known as the 
sense strand, or it may be the non-coding strand, also referred to as the 
anti-sense strand. 
By "isolated" nucleic acid molecule(s) is intended a nucleic acid
molecule, DNA or RNA, which has been removed from its native environment.
For example, recombinant DNA molecules contained in a vector are considered 
isolated for the purposes of the present invention. Further examples of isolated 
DNA molecules include recombinant DNA molecules maintained in 
heterologous host cells or purified (partially or substantially) DNA molecules in 
solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of 
the DNA molecules of the present invention. Isolated nucleic acid molecules 
according to the present invention further include such molecules produced 
synthetically. 

Isolated nucleic acid molecules of the present invention include DNA 
molecules comprising an open reading frame (ORF) beginning in frame with a 
serine codon at positions 2-4 of the nucleotide sequence shown in Figure 1A 
(SEQ ID NO:1).

In addition, isolated nucleic acid molecules of the invention include 
DNA molecules which comprise a sequence substantially different from those 
described above but which, due to the degeneracy of the genetic code, still 
encode the HLF protein. Of course, the genetic code and species-specific codon 
preferences are well known in the art. Thus, it would be routine for one skilled 
in the art to generate the degenerate variants described above, for instance, to
optimize codon expression for a particular host (e.g., change codons in the 
human mRNA to those preferred by a bacterial host such as E. coli). 

In another aspect, the invention provides isolated nucleic acid molecules 
encoding the HLF polypeptide having an amino acid sequence encoded by the 
cDNA clone contained in the plasmid deposited as ATCC Deposit No. 209123 
on June 19, 1997. 

The invention further provides an isolated nucleic acid molecule having 
the nucleotide sequence shown in Figures 1A and 1B (SEQ ID NO:1) or the
nucleotide sequence of the HLF cDNA contained in the above-described
deposited clone, or a nucleic acid molecule having a sequence complementary 
to one of the above sequences. Such isolated molecules, particularly DNA 
molecules, are useful as probes for gene mapping, by in situ hybridization with 
chromosomes, and for detecting expression of the HLF gene in human tissue, 
for instance, by Northern blot analysis. 

The present invention is further directed to nucleic acid molecules 
encoding portions of the nucleotide sequences described herein as well as to 
fragments of the isolated nucleic acid molecules described herein. In particular, 
the invention provides a polynucleotide having a nucleotide sequence
representing the portion of SEQ ID NO:1 which consists of positions 1-2199 of
SEQ ID NO:1.

In addition, the invention provides nucleic acid molecules having 
nucleotide sequences related to a portion of SEQ ID NO: 1 which has been 
determined from the following related cDNA clone: HAGFE38R. 

Further, the invention includes a polynucleotide comprising any portion 
of at least about 30 nucleotides, preferably at least about 50 nucleotides, of SEQ 
ID NO: 1 from residue about 1 to about 220 and from about 400 to 2199. More 
preferably, the invention includes a polynucleotide comprising nucleotide 
residues 1 to 2199, 1 to 1500, 1 to 1000, 1 to 500, 1 to 250, 250 to 2199, 250 
to 1500, 250 to 1000, 250 to 500, 500 to 2199, 500 to 1500, 500 to 1000, 
1000 to 2199, and 1000 to 1500. 

More generally, by a fragment of an isolated nucleic acid molecule 
having the nucleotide sequence of the deposited cDNA or the nucleotide 
sequence shown in Figures 1A and 1B (SEQ ID NO:1) is intended fragments at 
least about 15 nt, and more preferably at least about 20 nt, still more preferably 
at least about 30 nt, and even more preferably, at least about 40 nt in length 
which are useful as diagnostic probes and primers as discussed herein. Of 
course, larger fragments 50-300 nt in length are also useful according to the 
present invention as are fragments corresponding to most, if not all, of the 
nucleotide sequence of the deposited cDNA or as shown in Figures 1A and 1B 
(SEQ ID NO:1). By a fragment at least 20 nt in length, for example, is intended 
fragments which include 20 or more contiguous bases from the nucleotide 
sequence of the deposited cDNA or the nucleotide sequence as shown in 
Figures 1A and 1B (SEQ ID NO:1). Preferred nucleic acid fragments of the 
present invention include nucleic acid molecules encoding epitope-bearing 
portions of the HLF polypeptide as identified in Figure 3 and described in more 
detail below. Preferred nucleic acid fragments of the present invention also 
include nucleic acid molecules encoding Garnier-Robson and/or Chou-Fasman 
alpha, beta, and/or turn regions, Garnier-Robson coil regions, Kyte-Doolittle 
hydrophilic regions, Hopp-Woods hydrophobic regions, Eisenberg alpha 
and/or beta amphipathic regions, Karplus-Schulz flexible regions, 
Jameson-Wolf antigenic regions, and/or Emini surface probability regions of the 
HLF polypeptide as identified in Figure 3 or in a tabular representation of the 
data presented in Figure 3. 
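
The definition of a fragment as a run of contiguous bases from the reference sequence can be sketched as a simple check. This is only an illustration: the function name, the default minimum of 20 nt (one of several thresholds recited above), and the toy 40-nt reference are assumptions. In practice the reference would be SEQ ID NO:1 or the deposited cDNA.

```python
def is_fragment(candidate: str, reference: str, min_len: int = 20) -> bool:
    """True if candidate is at least min_len nt long and occurs as a
    contiguous run of bases within reference (case-insensitive)."""
    candidate, reference = candidate.upper(), reference.upper()
    return len(candidate) >= min_len and candidate in reference


# Toy 40-nt reference for illustration only (not SEQ ID NO:1).
TOY_REFERENCE = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAGC"

print(is_fragment(TOY_REFERENCE[5:30], TOY_REFERENCE))  # True: 25 contiguous nt
print(is_fragment(TOY_REFERENCE[5:20], TOY_REFERENCE))  # False: only 15 nt
```

Lowering min_len to 15 would accept the second candidate, which matches the broadest threshold recited in the paragraph.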



In another aspect, the invention provides an isolated nucleic acid 
molecule comprising a polynucleotide which hybridizes under stringent 
hybridization conditions to a portion of the polynucleotide in a nucleic acid 
molecule of the invention described above, for instance, the cDNA clone 
contained in ATCC Deposit No. 209123, or, for example, any specific HLF 
polynucleotide fragment described above (a non-limiting example is a 
Chou-Fasman alpha turn region). By "stringent hybridization conditions" is 
intended overnight incubation at 42°C in a solution comprising: 50% 
formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium 
phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 µg/ml 
denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1x 
SSC at about 65°C. 

By a polynucleotide which hybridizes to a "portion" of a polynucleotide 
is intended a polynucleotide (either DNA or RNA) hybridizing to at least about 
15 nucleotides (nt), and more preferably at least about 20 nt, still more 
preferably at least about 30 nt, and even more preferably about 30-70 (e.g., 50) 
nt of the reference polynucleotide. These are useful as diagnostic probes and 
primers as discussed above and in more detail below. 

By a portion of a polynucleotide of "at least 20 nt in length," for 
example, is intended 20 or more contiguous nucleotides from the nucleotide 
sequence of the reference polynucleotide (e.g., the deposited cDNA or the 
nucleotide sequence as shown in Figures 1A and 1B (SEQ ID NO:1)). Of 
course, a polynucleotide which hybridizes only to a poly A sequence (such as 
the 3' terminal poly(A) tract of the HLF cDNA shown in Figure 1B (SEQ ID 
NO:1)), or to a complementary stretch of T (or U) residues, would not be 
included in a polynucleotide of the invention used to hybridize to a portion of a 
nucleic acid of the invention, since such a polynucleotide would hybridize to 
any nucleic acid molecule containing a poly(A) stretch or the complement 
thereof (e.g., practically any double-stranded cDNA clone). 
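
The exclusion of probes that would hybridize only to a poly(A) tract or its complement can be sketched as a filter on base composition. The function name is hypothetical, and the sketch assumes that a probe made up solely of A residues, or solely of T and/or U residues, is the case the paragraph intends to exclude.

```python
def is_poly_a_or_poly_t(probe: str) -> bool:
    """True if the probe consists only of A residues, or only of T/U
    residues, and so could pair only with a poly(A) tract or its complement."""
    bases = set(probe.upper())
    if not bases:
        return False  # an empty string is not a probe at all
    return bases == {"A"} or bases <= {"T", "U"}


print(is_poly_a_or_poly_t("AAAAAAAAAAAAAAAAAAAA"))  # True: excluded
print(is_poly_a_or_poly_t("AAAAAAAAAAGCTTAAAAAA"))  # False: mixed sequence
```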

As indicated, nucleic acid molecules of the present invention which 
encode an HLF polypeptide may include, but are not limited to those encoding 
the amino acid sequence of the complete polypeptide, by itself; and the coding 
sequence for the complete polypeptide and additional sequences, such as those 
encoding an added secretory leader sequence, such as a pre-, or pro- or prepro- 
protein sequence. 

Also encoded by nucleic acids of the invention are the above protein 
sequences together with additional, non-coding sequences, including for 
example, but not limited to, introns and non-coding 5' and 3' sequences, such as 
the transcribed, non-translated sequences that play a role in transcription and 
mRNA processing (including splicing and polyadenylation signals), for example, 
ribosome binding and stability of mRNA; and an additional coding sequence which 
codes for additional amino acids, such as those which provide additional 
functionalities. 

Thus, the sequence encoding the polypeptide may be fused to a marker 
sequence, such as a sequence encoding a peptide which facilitates purification of 
the fused polypeptide. In certain preferred embodiments of this aspect of the 
invention, the marker amino acid sequence is a hexa-histidine peptide, such as 
the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, 
Chatsworth, CA, 91311), among others, many of which are commercially 
available. As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 
(1989), for instance, hexa-histidine provides for convenient purification of the 
fusion protein. The "HA" tag is another peptide useful for purification which 
corresponds to an epitope derived from the influenza hemagglutinin protein, 
which has been described by Wilson et al., Cell 37:767 (1984). As discussed 
below, other such fusion proteins include the HLF fused to Fc at the N- or 
C-terminus. 

Variant and Mutant Polynucleotides 

The present invention further relates to variants of the nucleic acid 
molecules of the present invention, which encode portions, analogs or 
derivatives of the HLF protein. Variants may occur naturally, such as a natural 
allelic variant. By an "allelic variant" is intended one of several alternate forms 
of a gene occupying a given locus on a chromosome of an organism. Genes II, 
Lewin, B., ed., John Wiley & Sons, New York (1985). Non-naturally 
occurring variants may be produced using art-known mutagenesis techniques. 

Such variants include those produced by nucleotide substitutions, 
deletions or additions. The substitutions, deletions or additions may involve 
one or more nucleotides. The variants may be altered in coding regions, 
non-coding regions, or both. Alterations in the coding regions may produce 
conservative or non-conservative amino acid substitutions, deletions or 
additions. Especially preferred among these are silent substitutions, additions 
and deletions, which do not alter the properties and activities of the HLF protein 
or portions thereof. Also especially preferred in this regard are conservative 
substitutions. 

Most highly preferred are nucleic acid molecules encoding the EGF 
domain of the protein having the amino acid sequence shown in SEQ ID NO:2 
or the EGF domain of the HLF amino acid sequence encoded by the deposited 
cDNA clone (nucleotides 77-280 of SEQ ID NO:1). 

Thus, one aspect of the invention provides an isolated nucleic acid 
molecule comprising a polynucleotide having a nucleotide sequence selected 
from the group consisting of: (a) a nucleotide sequence encoding the HLF 
polypeptide having the complete amino acid sequence in SEQ ID NO:2 (i.e., 
positions 1 to 157 of SEQ ID NO:2) or the complete amino acid sequence 
encoded by the cDNA clone contained in ATCC Deposit No. 209123; (b) a 
nucleotide sequence encoding the predicted extracellular domain of the HLF 
polypeptide having the amino acid sequence in SEQ ID NO:2 (i.e., positions 1 
to 101 of SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC 
Deposit No. 209123; (c) a nucleotide sequence encoding the predicted 
transmembrane domain of the HLF polypeptide having the amino acid sequence 
in SEQ ID NO:2 (i.e., positions 102 to 121 of SEQ ID NO:2) or as encoded by 
the cDNA clone contained in ATCC Deposit No. 209123; (d) a nucleotide 
sequence encoding the predicted intracellular domain of the HLF polypeptide 
having the amino acid sequence in SEQ ID NO:2 (i.e., positions 122 to 157 of 
SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit 
No. 209123; (e) a nucleotide sequence encoding a soluble HLF polypeptide 
having the extracellular and intracellular domains but lacking the transmembrane 
domain; and (f) a nucleotide sequence complementary to any of the nucleotide 
sequences in (a) through (e) above. 
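
The domain coordinates recited in (a) through (e) can be applied to a sequence as in the following sketch. The positions (1-101, 102-121, 122-157) come from the text; the function names and the placeholder sequence are assumptions, since SEQ ID NO:2 itself is not reproduced here. Positions are 1-based and inclusive, as in the specification.

```python
# Domain boundaries as recited for SEQ ID NO:2 (1-based, inclusive).
DOMAINS = {
    "extracellular": (1, 101),
    "transmembrane": (102, 121),
    "intracellular": (122, 157),
}


def domain(protein: str, name: str) -> str:
    """Slice one named domain out of a full-length sequence."""
    start, end = DOMAINS[name]
    return protein[start - 1:end]


def soluble_form(protein: str) -> str:
    """Extracellular plus intracellular domains, lacking the transmembrane domain."""
    return domain(protein, "extracellular") + domain(protein, "intracellular")


# Placeholder 157-residue string (E/T/I mark the regions); NOT the real sequence.
PLACEHOLDER = "E" * 101 + "T" * 20 + "I" * 36

print(len(domain(PLACEHOLDER, "transmembrane")))  # 20
print(len(soluble_form(PLACEHOLDER)))             # 137
```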

Further embodiments of the invention include isolated nucleic acid 
molecules that comprise a polynucleotide having a nucleotide sequence at least 
90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99% 
identical, to any of the nucleotide sequences in (a), (b), (c), (d), (e) or (f), 
above, or a polynucleotide which hybridizes under stringent hybridization 
conditions to a polynucleotide in (a), (b), (c), (d), (e) or (f), above. This 
polynucleotide which hybridizes does not hybridize under stringent 
hybridization conditions to a polynucleotide having a nucleotide sequence 
consisting of only A residues or of only T residues. An additional nucleic acid 
embodiment of the invention relates to an isolated nucleic acid molecule 
comprising a polynucleotide which encodes the amino acid sequence of an 
epitope-bearing portion of an HLF polypeptide having an amino acid sequence 
in (a), (b), (c), (d) or (e), above. 

The present invention also relates to recombinant vectors, which include 
the isolated nucleic acid molecules of the present invention, and to host cells 
containing the recombinant vectors, as well as to methods of making such 
vectors and host cells and for using them for production of HLF polypeptides or 
peptides by recombinant techniques. 

By a polynucleotide having a nucleotide sequence at least, for example, 
95% "identical" to a reference nucleotide sequence encoding an HLF 
polypeptide is intended that the nucleotide sequence of the polynucleotide is 
identical to the reference sequence except that the polynucleotide sequence may 
include up to five point mutations per each 100 nucleotides of the reference 
nucleotide sequence encoding the HLF polypeptide. In other words, to obtain a 
polynucleotide having a nucleotide sequence at least 95% identical to a reference 
nucleotide sequence, up to 5% of the nucleotides in the reference sequence may 
be deleted or substituted with another nucleotide, or a number of nucleotides up 
to 5% of the total nucleotides in the reference sequence may be inserted into the 
reference sequence. These mutations of the reference sequence may occur at the 
5' or 3' terminal positions of the reference nucleotide sequence or anywhere 
between those terminal positions, interspersed either individually among 
nucleotides in the reference sequence or in one or more contiguous groups 
within the reference sequence. 
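
The arithmetic behind "up to five point mutations per each 100 nucleotides" can be made concrete with a short sketch. The function name is hypothetical; the 2199-nt length is used only because the text cites positions 1-2199 of SEQ ID NO:1, and integer percentages are assumed so that the result is exact.

```python
def max_point_mutations(reference_length: int, min_identity_pct: int) -> int:
    """Largest whole number of point mutations that keeps a sequence at or
    above min_identity_pct identity to a reference of the given length."""
    return reference_length * (100 - min_identity_pct) // 100


print(max_point_mutations(100, 95))   # 5
print(max_point_mutations(2199, 95))  # 109
print(max_point_mutations(2199, 90))  # 219
```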

As a practical matter, whether any particular nucleic acid molecule is at 
least 90%, 95%, 96%, 97%, 98% or 99% identical to (or at most 10%, 5%, 
4%, 3%, 2% or 1% different from), for instance, the nucleotide sequence 
shown in Figures 1A and 1B or to the nucleotide sequence of the deposited 
cDNA clone can be determined conventionally using known computer programs 
such as the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 
for Unix, Genetics Computer Group, University Research Park, 575 Science 
Drive, Madison, WI 53711). Bestfit uses the local homology algorithm of 
Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981), to 
find the best segment of homology between two sequences. When using Bestfit 
or any other sequence alignment program to determine whether a particular 
sequence is, for instance, 95% identical to a reference sequence according to the 
present invention, the parameters are set, of course, such that the percentage of 
identity is calculated over the full length of the reference nucleotide sequence 
and that gaps in homology of up to 5% of the total number of nucleotides in the 
reference sequence are allowed. 

By a polynucleotide having a nucleotide sequence at least, for example, 
95% "identical" to (or 5% different from) a reference nucleotide sequence of the 
present invention, it is intended that the nucleotide sequence of the 
polynucleotide is identical to the reference sequence except that the 
polynucleotide sequence may include up to five point mutations per each 100 
nucleotides of the reference nucleotide sequence encoding the HLF polypeptide. 
In other words, to obtain a polynucleotide having a nucleotide sequence at least 
95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in 
the reference sequence may be deleted or substituted with another nucleotide, or 
a number of nucleotides up to 5% of the total nucleotides in the reference 
sequence may be inserted into the reference sequence. The query sequence may 
be an entire sequence shown in SEQ ID NO:1, the ORF (open reading frame), 
or any fragment specified as described herein. 

As a practical matter, whether any particular nucleic acid molecule or 
polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to (or at 
most 10%, 5%, 4%, 3%, 2% or 1% different from) a nucleotide sequence of 
the present invention can be determined conventionally using known computer 
programs. A preferred method for determining the best overall match between a 
query sequence (a sequence of the present invention) and a subject sequence, 
also referred to as a global sequence alignment, uses the 
FASTDB computer program based on the algorithm of Brutlag et al. (Comp. 
App. Biosci. (1990) 6:237-245). In a sequence alignment the query and subject 
sequences are both DNA sequences. An RNA sequence can be compared by 
converting U's to T's. The result of said global sequence alignment is in 
percent identity. Preferred parameters used in a FASTDB alignment of DNA 
sequences to calculate percent identity are: Matrix=Unitary, k-tuple=4, 
Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, 
Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or 
the length of the subject nucleotide sequence, whichever is shorter. 

If the subject sequence is shorter than the query sequence because of 5' 
or 3' deletions, not because of internal deletions, a manual correction must be 
made to the results. This is because the FASTDB program does not account for 
5' and 3' truncations of the subject sequence when calculating percent identity. 
For subject sequences truncated at the 5' or 3' ends, relative to the query 
sequence, the percent identity is corrected by calculating the number of bases of 
the query sequence that are 5' and 3' of the subject sequence, which are not 
matched/aligned, as a percent of the total bases of the query sequence. Whether 
a nucleotide is matched/aligned is determined by results of the FASTDB 
sequence alignment. This percentage is then subtracted from the percent 
identity, calculated by the above FASTDB program using the specified 
parameters, to arrive at a final percent identity score. This corrected score is 
what is used for the purposes of the present invention. Only bases outside the 
5' and 3' bases of the subject sequence, as displayed by the FASTDB 
alignment, which are not matched/aligned with the query sequence, are 
calculated for the purposes of manually adjusting the percent identity score. 

For example, a 90 base subject sequence is aligned to a 100 base query 
sequence to determine percent identity. The deletions occur at the 5' end of the 
subject sequence and therefore, the FASTDB alignment does not show a 
match/alignment of the first 10 bases at the 5' end. The 10 unpaired bases 
represent 10% of the sequence (number of bases at the 5' and 3' ends not 
matched/total number of bases in the query sequence) so 10% is subtracted from 
the percent identity score calculated by the FASTDB program. If the remaining 
90 bases were perfectly matched the final percent identity would be 90%. In 
another example, a 90 base subject sequence is compared with a 100 base query 
sequence. This time the deletions are internal deletions so that there are no 
bases on the 5' or 3' of the subject sequence which are not matched/aligned with 
the query. In this case the percent identity calculated by FASTDB is not 
manually corrected. Once again, only bases 5' and 3' of the subject sequence 
which are not matched/aligned with the query sequence are manually corrected 
for. No other manual corrections are to be made for the purposes of the present 
invention. 
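
The manual correction described above reduces to one subtraction, sketched below. The function and parameter names are hypothetical, and the FASTDB percent identity is taken as an input rather than computed, since the program itself is not reproduced here. In the second call, the 88.0 passed in is an arbitrary stand-in for whatever FASTDB would report for a subject with internal deletions.

```python
def corrected_percent_identity(fastdb_pct: float, query_len: int,
                               unmatched_5prime: int, unmatched_3prime: int) -> float:
    """Subtract from the FASTDB score the share of query bases that lie 5'
    or 3' of the subject sequence and are not matched/aligned."""
    penalty = 100.0 * (unmatched_5prime + unmatched_3prime) / query_len
    return fastdb_pct - penalty


# First example: 10 query bases unmatched at the 5' end, remaining 90 match perfectly.
print(corrected_percent_identity(100.0, 100, 10, 0))  # 90.0

# Second example: deletions are internal, so no terminal bases are unmatched
# and the FASTDB value (here a stand-in) passes through unchanged.
print(corrected_percent_identity(88.0, 100, 0, 0))    # 88.0
```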

The present application is directed to nucleic acid molecules at least 
90%, 95%, 96%, 97%, 98% or 99% identical to (or at most 10%, 5%, 4%, 
3%, 2% or 1% different from) the nucleic acid sequence shown in Figures 1A 
and 1B (SEQ ID NO:1) or to the nucleic acid sequence of the deposited cDNA, 
irrespective of whether they encode a polypeptide having HLF activity. This is 
because even where a particular nucleic acid molecule does not encode a 
polypeptide having HLF activity, one of skill in the art would still know how to 
use the nucleic acid molecule, for instance, as a hybridization probe or a 
polymerase chain reaction (PCR) primer. Uses of the nucleic acid molecules of 
the present invention that do not encode a polypeptide having HLF activity 
include, inter alia, (1) isolating the HLF gene or allelic variants thereof in a 
cDNA library; (2) in situ hybridization (e.g., "FISH") to metaphase 
chromosomal spreads to provide precise chromosomal location of the HLF 
gene, as described in Verma et al., Human Chromosomes: A Manual of Basic 
Techniques, Pergamon Press, New York (1988); and (3) Northern blot analysis 
for detecting HLF mRNA expression in specific tissues. 

Preferred, however, are nucleic acid molecules having sequences at least 
90%, 95%, 96%, 97%, 98% or 99% identical to (or at most 10%, 5%, 4%, 
3%, 2% or 1% different from) the nucleic acid sequence shown in Figures 1A 
and 1B (SEQ ID NO:1) or to the nucleic acid sequence of the deposited cDNA 
which do, in fact, encode a polypeptide having HLF protein activity. By "a 
polypeptide having HLF activity" is intended polypeptides exhibiting activity 
similar, but not necessarily identical, to an activity of the full-length or soluble 
EGF domain of the HLF protein of the invention, as measured in a particular 
biological assay. For example, the HLF protein of the present invention can be 
assayed for activity by analyzing changes in the phosphorylation state of cell 
surface receptors. As described by Marchionni and colleagues (Nature 
362:312-318; 1993), a tyrosine kinase activation assay may be used to 
determine such activity. In this assay, a wide variety of cells and cell lines are 
allowed to become quiescent in low serum medium. HLF protein, or variants 
thereof, may then be added exogenously to the growth medium. Cultured cells 
are then lysed in an SDS-based lysis buffer and subjected to SDS-PAGE. The 
proteins separated by SDS-PAGE are then transferred to a membrane and 
immunoblotted with an anti-phosphotyrosine antibody. Changes in tyrosine 
phosphorylation state of cell surface receptor molecules may then be assessed 
by comparing immunoblots of cell samples which were treated or not treated 
with HLF, or a variant thereof. Such activity is useful for determining the effect 
of HLF, or variants thereof, on the stimulation of a wide variety of cell surface 
receptor molecules and determining which signal transduction cascades may be 
initiated by the binding of HLF, or a variant thereof. 

HLF protein binding modulates the tyrosine phosphorylation state and 
initiates a variety of signal transduction cascades in erbB family or other cell 
surface receptor molecules in a dose-dependent manner in the above-described 
assay. Thus, "a polypeptide having HLF protein activity" includes polypeptides 
that also exhibit any of the same binding and phosphorylation state altering 
activities in the above-described assays in a dose-dependent manner. Although 
the degree of dose-dependent activity need not be identical to that of the HLF 
protein, preferably, "a polypeptide having HLF protein activity" will exhibit 
substantially similar dose-dependence in a given activity as compared to the 
HLF protein (i.e., the candidate polypeptide will exhibit greater activity or not 
more than about 25-fold less and, preferably, not more than about tenfold less 
activity relative to the reference HLF protein). 
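
The fold-difference criterion in the preceding parenthetical can be expressed as a simple comparison. This is a rough sketch: the function name and the activity units are hypothetical, the reference activity is assumed to be positive, and the text's "about 25-fold" and "about tenfold" are treated here as hard cut-offs purely for illustration.

```python
def within_activity_range(candidate: float, reference: float,
                          max_fold_less: float = 25.0) -> bool:
    """True if the candidate shows greater activity than the reference, or
    activity that is lower by no more than max_fold_less fold."""
    if candidate >= reference:
        return True
    return candidate > 0 and reference / candidate <= max_fold_less


print(within_activity_range(5.0, 100.0))                      # True: 20-fold less
print(within_activity_range(5.0, 100.0, max_fold_less=10.0))  # False under the preferred limit
```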

In addition, the binding of HLF, or variants thereof, to a particular 
receptor molecule may be assayed by cross-linking HLF, or an HLF variant, to 
whichever receptor it binds on the cell surface and then immunoprecipitating the 
resulting ligand/receptor complex with a specific antiserum. Such an assay will 
thereby indicate a specific receptor binding profile for the HLF protein(s). As 
described by Holmes and colleagues (Science 256:1205-1210; 1992), 
125I-labeled HLF, or HLF variant, protein is cross-linked to any of a variety of cells 
or cell lines by incubating a suspension of cells and 10 {} CPM of 125I-labeled 
HLF, or HLF variant, proteins in Hank's balanced salts (Life Technologies, 
Inc., Rockville, MD) for 30 minutes at 22°C. Bis(sulfosuccinimidyl) suberate 
is added to the cell suspensions to a final concentration of 1 mM and the 
suspensions are incubated for an additional 30 minutes. Cells are washed with 
Tris-buffered saline (TBS) and then lysed in TBS containing Triton X-100 
(0.5%). Immunoprecipitations are then performed using portions of treated and 
mock-treated cultures combined with antisera to specific cellular receptor 
molecules. Samples are then prepared in SDS sample buffer, analyzed by 
SDS-PAGE (5.5% polyacrylamide gels), and visualized by autoradiography. 

Of course, due to the degeneracy of the genetic code, one of ordinary 
skill in the art will immediately recognize that a large number of the nucleic acid 
molecules having a sequence at least 90%, 95%, 96%, 97%, 98%, or 99% 
identical to (or at most 10%, 5%, 4%, 3%, 2% or 1% different from) the 
nucleic acid sequence of the deposited cDNA or the nucleic acid sequence 
shown in Figures 1A and 1B (SEQ ID NO:1) will encode a polypeptide "having 
HLF protein activity." In fact, since degenerate variants of these nucleotide 
sequences all encode the same polypeptide, this will be clear to the skilled 
artisan even without performing the above described comparison assay. It will 
be further recognized in the art that, for such nucleic acid molecules that are not 
degenerate variants, a reasonable number will also encode a polypeptide having 
HLF protein activity. This is because the skilled artisan is fully aware of amino 
acid substitutions that are either less likely or not likely to significantly affect 
protein function (e.g., replacing one aliphatic amino acid with a second aliphatic 
amino acid), as further described below. 
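
The point about degenerate variants can be illustrated with the standard genetic code: two coding sequences that differ at a silent position translate to the same peptide. The sketch below builds the standard codon table and compares two short sequences. Both sequences are hypothetical 12-nt examples, not portions of SEQ ID NO:1, and the function name is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon, ignoring any partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = "ATGGCTCTGAAA"  # hypothetical
variant = "ATGGCCCTGAAA"   # GCT -> GCC, a silent change (both encode Ala)
differences = sum(a != b for a, b in zip(original, variant))

print(translate(original), translate(variant), differences)  # MALK MALK 1
```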

Vectors and Host Cells 

The present invention also relates to vectors which include the isolated 
DNA molecules of the present invention, host cells which are genetically 
engineered with the recombinant vectors, and the production of HLF 
polypeptides or fragments thereof by recombinant techniques. The vector may 
be, for example, a phage, plasmid, viral or retroviral vector. Retroviral vectors 
may be replication competent or replication defective. In the latter case, viral 
propagation generally will occur only in complementing host cells. 

The polynucleotides may be joined to a vector containing a selectable 
marker for propagation in a host. Generally, a plasmid vector is introduced in a 
precipitate, such as a calcium phosphate precipitate, or in a complex with a 
charged lipid. If the vector is a virus, it may be packaged in vitro using an 
appropriate packaging cell line and then transduced into host cells. 

In one embodiment, the DNA of the invention is operatively associated 
with an appropriate heterologous regulatory element (e.g. a promoter and/or 
enhancer), such as the phage lambda PL promoter, the E. coli lac, trp, phoA 
and tac promoters, the SV40 early and late promoters and promoters of 
retroviral LTRs, to name a few. Other suitable promoters and enhancers will be 
known to the skilled artisan. 

In embodiments in which vectors contain expression constructs, these 
constructs will further contain sites for transcription initiation, termination and, 
in the transcribed region, a ribosome binding site for translation. The coding 
portion of the transcripts expressed by the constructs will preferably include a 
translation initiating codon at the beginning and a termination codon (UAA, 
UGA or UAG) appropriately positioned at the end of the polypeptide to be 
translated. 
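
The requirement that the coding portion begin with an initiating codon and end with one of the three termination codons can be checked as sketched below. The stop codons UAA, UGA and UAG are taken from the text; the use of AUG as the initiating codon, the rejection of in-frame internal stop codons, and the function name are assumptions made for the illustration.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}
START_CODON = "AUG"  # assumed; the text says only "translation initiating codon"


def has_start_and_stop(coding_region: str) -> bool:
    """True if the region is a whole number of codons, begins with AUG, ends
    with a stop codon, and contains no earlier in-frame stop codon."""
    rna = coding_region.upper().replace("T", "U")
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return (codons[0] == START_CODON
            and codons[-1] in STOP_CODONS
            and not any(c in STOP_CODONS for c in codons[1:-1]))


print(has_start_and_stop("AUGGCUCUGAAAUAA"))  # True
print(has_start_and_stop("AUGGCUCUGAAA"))     # False: no termination codon
```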

As indicated, the expression vectors will preferably include at least one 
selectable marker. Such markers include dihydrofolate reductase, G418 or 
neomycin resistance for eukaryotic cell culture and tetracycline, kanamycin or 
ampicillin resistance genes for culturing in E. coli and other bacteria. 
Representative examples of appropriate hosts include, but are not limited to, 
bacterial cells, such as E. coli, Streptomyces and Salmonella typhimurium cells; 
fungal cells, such as yeast cells; insect cells such as Drosophila S2 and 
Spodoptera Sf9 cells; animal cells such as CHO, COS, 293 and Bowes 
melanoma cells; and plant cells. Appropriate culture mediums and conditions 
for the above-described host cells are known in the art. 



Vectors preferred for use in bacteria include pQE70, pQE60 and 
pQE-9, available from QIAGEN, Inc., supra; pBS vectors, Phagescript vectors, 
Bluescript vectors, pNH8A, pNH16a, pNH18A, pNH46A, available from 
Stratagene; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available 
from Pharmacia. Among preferred eukaryotic vectors are pWLNEO, 
pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, 
pBPV, pMSG and pSVL available from Pharmacia. Other suitable vectors will 
be readily apparent to the skilled artisan. 

Introduction of the construct into the host cell can be effected by calcium 
phosphate transfection, DEAE-dextran mediated transfection, cationic 
lipid-mediated transfection, electroporation, transduction, infection or other 
methods. Such methods are described in many standard laboratory manuals, 
such as Davis et al., Basic Methods In Molecular Biology (1986). 

The polypeptide may be expressed in a modified form, such as a fusion 
protein (comprising the polypeptide joined via a peptide bond to a heterologous 
protein sequence (of a different protein)), and may include not only secretion 
signals, but also additional heterologous functional regions. Such a protein can 
be made by ligating polynucleotides of the invention and the desired nucleic acid 
sequence encoding the desired amino acid sequence to each other, by methods 
known in the art, in the proper reading frame, and expressing the fusion protein 
product by methods known in the art. Alternatively, such a fusion protein can 
be made by protein synthetic techniques, e.g. by use of a peptide synthesizer. 
For instance, a region of additional amino acids, particularly charged amino 
acids, may be added to the N-terminus of the polypeptide to improve stability 
and persistence in the host cell, during purification, or during subsequent 
handling and storage. Also, peptide moieties may be added to the polypeptide 
to facilitate purification. Such regions may be removed prior to final preparation 
of the polypeptide. The addition of peptide moieties to polypeptides to engender 
secretion or excretion, to improve stability and to facilitate purification, among 
others, are familiar and routine techniques in the art. A preferred fusion protein 
comprises a heterologous region from immunoglobulin that is useful to stabilize 
and purify proteins. For example, EP-A-0 464 533 (Canadian counterpart 
2045869) discloses fusion proteins comprising various portions of constant 
region of immunoglobulin molecules together with another human protein or 
part thereof. In many cases, the Fc part in a fusion protein is thoroughly 
advantageous for use in therapy and diagnosis and thus results, for example, in 
improved pharmacokinetic properties (EP-A 0 232 262). On the other hand, for 
some uses it would be desirable to be able to delete the Fc part after the fusion 
protein has been expressed, detected and purified in the advantageous manner 
described. This is the case when the Fc portion proves to be a hindrance to use in 
therapy and diagnosis, for example when the fusion protein is to be used as 
antigen for immunizations. In drug discovery, for example, human proteins, 
such as hIL-5, have been fused with Fc portions for the purpose of 
high-throughput screening assays to identify antagonists of hIL-5. See, D. 
Bennett et al., J. Molecular Recognition 8:52-58 (1995) and K. Johanson et al., 
J. Biol. Chem. 270:9459-9471 (1995). 

The HLF protein can be recovered and purified from recombinant cell 
cultures by well-known methods including ammonium sulfate or ethanol 
precipitation, acid extraction, anion or cation exchange chromatography, 
phosphocellulose chromatography, hydrophobic interaction chromatography, 
affinity chromatography, hydroxylapatite chromatography and lectin 
chromatography. Most preferably, high performance liquid chromatography 
("HPLC") is employed for purification. Polypeptides of the present invention 
include: products purified from natural sources, including bodily fluids, tissues 
and cells, whether directly isolated or cultured; products of chemical synthetic 
procedures; and products produced by recombinant techniques from a 
prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher 
plant, insect and mammalian cells. Depending upon the host employed in a 
recombinant production procedure, the polypeptides of the present invention 
may be glycosylated or may be non-glycosylated. In addition, polypeptides of 
the invention may also include an initial modified methionine residue, in some 
cases as a result of host-mediated processes. Thus, it is well known in the art 
that the N-terminal methionine encoded by the translation initiation codon 
generally is removed with high efficiency from any protein after translation in all 
eukaryotic cells. While the N-terminal methionine on most proteins also is 
efficiently removed in most prokaryotes, for some proteins this prokaryotic 
removal process is inefficient, depending on the nature of the amino acid to 
which the N-terminal methionine is covalently linked. 

Polypeptides and Fragments 

The invention further provides an isolated HLF polypeptide having the 
amino acid sequence encoded by the deposited cDNA, or the amino acid 



WO 98/57989 



PCT/US98/12403 



sequence in SEQ ID NO:2, or a peptide or polypeptide comprising a portion of 
the above polypeptides. 

Variant and Mutant Polypeptides 

To improve or alter the characteristics of HLF polypeptides, protein 
engineering may be employed. Recombinant DNA technology known to those 
skilled in the art can be used to create novel mutant proteins or "muteins" 
including single or multiple amino acid substitutions, deletions, additions or 
fusion proteins. Such modified polypeptides can show, e.g., enhanced activity 
or increased stability. In addition, they may be purified in higher yields and 
show better solubility than the corresponding natural polypeptide, at least under 
certain purification and storage conditions. 

N-Terminal and C-Terminal Deletion Mutants 

For instance, for many proteins, including the extracellular domain of a 
membrane associated protein or the mature form(s) of a secreted protein, it is 
known in the art that one or more amino acids may be deleted from the 
N-terminus or C-terminus without substantial loss of biological function. For 
instance, Ron and colleagues (J. Biol. Chem., 268:2984-2988; 1993) reported 
modified KGF proteins that had heparin binding activity even if 3, 8, or 27 
amino-terminal amino acid residues were missing. In the present case, since the 
protein of the invention is a member of the EGF polypeptide family, deletions of 
N-terminal amino acids up to the cysteine at position 35 of SEQ ID NO:2 may 
retain some biological activity such as receptor binding and the initiation of the 
corresponding signal transduction cascade. Polypeptides having further 
N-terminal deletions including the cysteine residue at position 35 of SEQ ID 
NO:2 would not be expected to retain such biological activities because it is 
known that this residue in an EGF-like, or heregulin, polypeptide is one of six 
conserved cysteine residues required for both structure and biological activity. 
That is to say, the first cysteine is required for forming one of several disulfide 
bridges needed to provide structural stability which is, in turn, necessary for 
receptor binding and the initiation of the signal transduction cascade. 

However, even if deletion of one or more amino acids from the 
N-terminus of a protein results in modification or loss of one or more biological 
functions of the protein, other biological activities may still be retained. Thus, 
the ability of the shortened protein to induce and/or bind to antibodies which 
recognize the complete or extracellular domain of the protein generally will be 
retained when less than the majority of the residues of the complete or 
extracellular domain of the protein are removed from the N-terminus. Whether 
a particular polypeptide lacking N-terminal residues of a complete protein retains 
such immunologic activities can readily be determined by routine methods 
described herein and otherwise known in the art. 

Accordingly, the present invention further provides polypeptides having 
one or more residues deleted from the amino terminus of the amino acid 
sequence of HLF shown in SEQ ID NO:2, up to the cysteine residue at position 
number 35, and polynucleotides encoding such polypeptides. In particular, the 
present invention provides polypeptides comprising the amino acid sequence of 
residues n-157 of SEQ ID NO:2, where n is an integer in the range of 1-35, and 
35 is the position of the first residue from the N-terminus of the complete HLF 
polypeptide (shown in SEQ ID NO:2) believed to be required for the HLF 
protein to bind its receptor and initiate the corresponding signal transduction 
cascade. 

More in particular, the invention provides polynucleotides encoding 
polypeptides comprising, or alternatively consisting of, the amino acid sequence 
of residues 1-157, 2-157, 3-157, 4-157, 5-157, 6-157, 7-157, 8-157, 9-157, 
10-157, 11-157, 12-157, 13-157, 14-157, 15-157, 16-157, 17-157, 18-157, 
19-157, 20-157, 21-157, 22-157, 23-157, 24-157, 25-157, 26-157, 27-157, 
28-157, 29-157, 30-157, 31-157, 32-157, 33-157, 34-157, and 35-157 of 
SEQ ID NO:2. Polynucleotides encoding these polypeptides also are provided. 
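
The N-terminal deletion series listed above can be enumerated mechanically. The following Python sketch is illustrative only: the function name and constants are assumptions introduced here, while the values 157 (last residue named in the ranges) and 35 (position of the first conserved cysteine) are taken from the text.

```python
# Illustrative sketch: enumerate the N-terminal deletion ranges "n-157".
FULL_LENGTH = 157      # last residue of SEQ ID NO:2 named in the ranges above
FIRST_CYSTEINE = 35    # first conserved cysteine, per the text


def n_terminal_ranges(first_required=FIRST_CYSTEINE, length=FULL_LENGTH):
    """Return labels 'n-length' for every start position n = 1 .. first_required."""
    return [f"{n}-{length}" for n in range(1, first_required + 1)]


ranges = n_terminal_ranges()
print(len(ranges), ranges[0], ranges[-1])
```

Each label names a polypeptide that begins at residue n and runs to residue 157, so the series stops at the start position that still includes the cysteine at position 35.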

Similarly, many examples of biologically functional C-terminal deletion 
muteins are known. For instance, Interferon-gamma shows up to ten times 
higher activities by deleting 8-10 amino acid residues from the carboxy terminus 
of the protein (Dobeli et al., J. Biotechnology 7:199-216; 1988). In the present 
case, since the protein of the invention is a member of the EGF or heregulin-like 
polypeptide family, deletions of C-terminal amino acids up to the most 
carboxy-terminal cysteine of the extracellular domain (position 73 of SEQ ID 
NO:2) may retain some biological activity such as receptor binding and 
the initiation of the corresponding signal transduction cascade. Polypeptides 
having further C-terminal deletions including the cysteine residue at position 73 
of SEQ ID NO:2 would not be expected to retain such biological activities 
because it is known that this residue in an EGF-like, or heregulin-like, 
polypeptide is one of six conserved cysteine residues required for both structure 
and biological activity. That is to say, this cysteine is required for forming 
one of several disulfide bridges needed to provide structural stability which is, 
in turn, necessary for receptor binding and the initiation of the signal 
transduction cascade. 

However, even if deletion of one or more amino acids from the 
C-terminus of a protein results in modification or loss of one or more biological 
functions of the protein, other biological activities may still be retained. Thus, 
the ability of the shortened protein to induce and/or bind to antibodies which 
recognize the complete or extracellular domain of the protein generally will be 
retained when less than the majority of the residues of the complete or 
extracellular domain of the protein are removed from the C-terminus. Whether a 
particular polypeptide lacking C-terminal residues of a complete protein retains 
such immunologic activities can readily be determined by routine methods 
described herein and otherwise known in the art. 

Accordingly, the present invention further provides polypeptides having 
one or more residues deleted from the carboxy terminus of the amino acid 
sequence of the HLF shown in SEQ ID NO:2, up to the cysteine residue at 
position 73 of SEQ ID NO:2, and polynucleotides encoding such polypeptides. 
In particular, the present invention provides polypeptides having the amino acid 
sequence of residues 1-m of the amino acid sequence in SEQ ID NO:2, where m 
is any integer in the range of 73 to 101, and residue 73 is the position of the first 
residue from the C-terminus of the complete HLF polypeptide (shown in SEQ 
ID NO:2) believed to be required for receptor binding and initiation of the 
corresponding signal transduction cascade. 

More in particular, the invention provides polynucleotides encoding 
polypeptides comprising, or alternatively consisting of, the amino acid sequence 
of residues 1-73, 1-74, 1-75, 1-76, 1-77, 1-78, 1-79, 1-80, 1-81, 1-82, 
1-83, 1-84, 1-85, 1-86, 1-87, 1-88, 1-89, 1-90, 1-91, 1-92, 1-93, 
1-94, 1-95, 1-96, 1-97, 1-98, 1-99, 1-100, and 1-101 of SEQ ID NO:2. 
Polynucleotides encoding these polypeptides also are provided. 
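
The C-terminal series can be written out the same way. This Python sketch is illustrative only: the function name is an assumption, and the bounds 73 and 101 are simply the range of m stated in the text.

```python
# Illustrative sketch: enumerate the C-terminal deletion ranges "1-m".
LAST_CYSTEINE = 73   # most carboxy-terminal cysteine of the extracellular domain
UPPER_BOUND = 101    # largest value of m stated in the text


def c_terminal_ranges(last_required=LAST_CYSTEINE, upper=UPPER_BOUND):
    """Return labels '1-m' for every end position m = last_required .. upper."""
    return [f"1-{m}" for m in range(last_required, upper + 1)]


c_ranges = c_terminal_ranges()
print(len(c_ranges), c_ranges[0], c_ranges[-1])
```

Each label names a polypeptide that starts at residue 1 and ends at residue m, so every member of the series still contains the cysteine at position 73.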

The invention also provides polypeptides having one or more amino 
acids deleted from both the amino and the carboxyl termini, which may be 
described generally as having residues n-m of SEQ ID NO:2, where n and m are 
integers as described above. 
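
As a rough illustration of how many n-m combinations this covers, the Python sketch below pairs every n in 1-35 with every m in 73-101, the two ranges given above. The function name is an assumption introduced for this example.

```python
# Illustrative sketch: all double-deletion fragments "n-m" of SEQ ID NO:2,
# using the ranges of n (1-35) and m (73-101) stated in the text.
from itertools import product


def double_deletion_ranges(n_max=35, m_min=73, m_max=101):
    """Return (n, m) pairs for fragments spanning residues n through m."""
    return list(product(range(1, n_max + 1), range(m_min, m_max + 1)))


pairs = double_deletion_ranges()
print(len(pairs))                         # 35 start positions x 29 end positions
print(f"{pairs[0][0]}-{pairs[0][1]}")     # longest start, shortest end in the grid
print(f"{pairs[-1][0]}-{pairs[-1][1]}")   # the fragment 35-101
```

Because the largest n (35) is smaller than the smallest m (73), every pair describes a fragment that spans both flanking cysteines.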

Also included is a nucleotide sequence encoding a polypeptide 
consisting of a portion of the complete HLF amino acid sequence encoded by 
the cDNA clone contained in ATCC Deposit No. 209123, where this portion 
excludes from 1 to about 34 amino acids from the amino terminus of the 
complete amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 209123, or from 1 to about 83 amino acids from the carboxy 
terminus, or any combination of the above amino terminal and carboxy terminal 
deletions, of the complete amino acid sequence encoded by the cDNA clone 
contained in ATCC Deposit No. 209123. Polynucleotides encoding all of the 
above deletion mutant polypeptide forms also are provided. 

As mentioned above, even if deletion of one or more amino acids from 
the N-terminus of a protein results in modification or loss of one or more 
biological functions of the protein, other biological activities may still be 
retained. Thus, the ability of the shortened HLF mutein to induce and/or bind to 
antibodies which recognize the complete or mature form of the protein generally 
will be retained when less than the majority of the residues of the complete or 
mature protein are removed from the N-terminus. Whether a particular 
polypeptide lacking N-terminal residues of a complete protein retains such 
immunologic activities can readily be determined by routine methods described 
herein and otherwise known in the art. It is not unlikely that an HLF mutein with 
a large number of deleted N-terminal amino acid residues may retain some 
biological or immunogenic activities. In fact, peptides composed of as few as six 
HLF amino acid residues may often evoke an immune response. 

Accordingly, the present invention further provides polypeptides having 
one or more residues deleted from the amino terminus of the amino acid 
sequence of the HLF shown in SEQ ID NO:2, up to the asparagine residue at 
position number 152, and polynucleotides encoding such polypeptides. In 
particular, the present invention provides polypeptides comprising the amino 
acid sequence of residues n'-157 of SEQ ID NO:2, where n' is an integer in the 
range of 2-152, and 153 is the position of the first residue from the N-terminus 
of the complete HLF polypeptide believed to be required for at least 
immunogenic activity of the HLF protein. 
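
The limit of 152 appears consistent with the six-residue minimum noted above: a fragment that starts at residue 152 of a 157-residue sequence contains exactly six residues. A small Python check of that arithmetic, using a hypothetical helper name introduced here, might look like this:

```python
# Illustrative sketch: last start position that still leaves a fragment of
# at least `min_len` residues in a sequence of `length` residues.
def last_start_for_min_length(length, min_len):
    """Largest 1-based start n such that residues n..length number >= min_len."""
    return length - min_len + 1


def fragment_length(start, end):
    """Number of residues in the inclusive 1-based range start..end."""
    return end - start + 1


n_limit = last_start_for_min_length(157, 6)
print(n_limit, fragment_length(n_limit, 157))
```

The same arithmetic, applied from the other end, gives residues 1-6 as the shortest C-terminal deletion fragment of six residues.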

More in particular, the invention provides polynucleotides encoding 
polypeptides comprising, or alternatively consisting of, the amino acid sequence 
of residues S-2 to K-157; S-3 to K-157; S-4 to K-157; S-5 to K-157; A-6 to 
K-157; T-7 to K-157; T-8 to K-157; T-9 to K-157; T-10 to K-157; P-11 to 
K-157; E-12 to K-157; T-13 to K-157; S-14 to K-157; T-15 to K-157; S-16 to 
K-157; P-17 to K-157; K-18 to K-157; F-19 to K-157; H-20 to K-157; T-21 to 
K-157; T-22 to K-157; T-23 to K-157; Y-24 to K-157; S-25 to K-157; T-26 to 
K-157; E-27 to K-157; R-28 to K-157; S-29 to K-157; E-30 to K-157; H-31 to 
K-157; F-32 to K-157; K-33 to K-157; P-34 to K-157; C-35 to K-157; R-36 to 
K-157; D-37 to K-157; K-38 to K-157; D-39 to K-157; L-40 to K-157; A-41 to 
K-157; Y-42 to K-157; C-43 to K-157; L-44 to K-157; N-45 to K-157; D-46 to 
K-157; G-47 to K-157; E-48 to K-157; C-49 to K-157; F-50 to K-157; V-51 to 
K-157; I-52 to K-157; E-53 to K-157; T-54 to K-157; L-55 to K-157; T-56 to 
K-157; G-57 to K-157; S-58 to K-157; H-59 to K-157; K-60 to K-157; H-61 
to K-157; C-62 to K-157; R-63 to K-157; C-64 to K-157; K-65 to K-157; E-66 
to K-157; G-67 to K-157; Y-68 to K-157; Q-69 to K-157; G-70 to K-157; V-71 
to K-157; R-72 to K-157; C-73 to K-157; D-74 to K-157; Q-75 to K-157; F-76 
to K-157; L-77 to K-157; P-78 to K-157; K-79 to K-157; T-80 to K-157; D-81 
to K-157; S-82 to K-157; I-83 to K-157; L-84 to K-157; S-85 to K-157; D-86 
to K-157; P-87 to K-157; N-88 to K-157; H-89 to K-157; L-90 to K-157; G-91 
to K-157; I-92 to K-157; E-93 to K-157; F-94 to K-157; M-95 to K-157; E-96 
to K-157; S-97 to K-157; E-98 to K-157; E-99 to K-157; V-100 to K-157; 
Y-101 to K-157; Q-102 to K-157; R-103 to K-157; Q-104 to K-157; V-105 to 
K-157; L-106 to K-157; S-107 to K-157; I-108 to K-157; S-109 to K-157; 
C-110 to K-157; I-111 to K-157; I-112 to K-157; F-113 to K-157; G-114 to 
K-157; I-115 to K-157; V-116 to K-157; I-117 to K-157; V-118 to K-157; 
G-119 to K-157; M-120 to K-157; F-121 to K-157; C-122 to K-157; A-123 to 
K-157; A-124 to K-157; F-125 to K-157; Y-126 to K-157; F-127 to K-157; 
K-128 to K-157; S-129 to K-157; K-130 to K-157; R-131 to K-157; N-132 to 
K-157; I-133 to K-157; T-134 to K-157; A-135 to K-157; N-136 to K-157; 
S-137 to K-157; V-138 to K-157; S-139 to K-157; E-140 to K-157; E-141 to 
K-157; R-142 to K-157; W-143 to K-157; K-144 to K-157; G-145 to K-157; 
L-146 to K-157; P-147 to K-157; S-148 to K-157; Q-149 to K-157; E-150 to 
K-157; P-151 to K-157; and N-152 to K-157 of the HLF sequence shown in 
SEQ ID NO:2. Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

Also as mentioned above, even if deletion of one or more amino acids 
from the C-terminus of a protein results in modification or loss of one or more 
biological functions of the protein, other biological activities may still be 
retained. Thus, the ability of the shortened HLF mutein to induce and/or bind to 
antibodies which recognize the complete or mature form of the protein generally 
will be retained when less than the majority of the residues of the complete or 
mature protein are removed from the C-terminus. Whether a particular 
polypeptide lacking C-terminal residues of a complete protein retains such 
immunologic activities can readily be determined by routine methods described 
herein and otherwise known in the art. It is not unlikely that an HLF mutein with 
a large number of deleted C-terminal amino acid residues may retain some 
biological or immunogenic activities. In fact, peptides composed of as few as six 
HLF amino acid residues may often evoke an immune response. 

Accordingly, the present invention further provides polypeptides having 
one or more residues deleted from the carboxy terminus of the amino acid 
sequence of the HLF shown in SEQ ID NO:2, up to the alanine residue at 
position number 6, and polynucleotides encoding such polypeptides. In 
particular, the present invention provides polypeptides comprising the amino 
acid sequence of residues 1-m' of SEQ ID NO:2, where m' is an integer in the 
range of 7-156, and 6 is the position of the first residue from the C-terminus of 
the complete HLF polypeptide believed to be required for at least immunogenic 
activity of the HLF protein. 

More in particular, the invention provides polynucleotides encoding 
polypeptides comprising, or alternatively consisting of, the amino acid sequence 
of residues S-1 to D-156; S-1 to Q-155; S-1 to Q-154; S-1 to L-153; S-1 to 
N-152; S-1 to P-151; S-1 to E-150; S-1 to Q-149; S-1 to S-148; S-1 to P-147; 
S-1 to L-146; S-1 to G-145; S-1 to K-144; S-1 to W-143; S-1 to R-142; S-1 to 
E-141; S-1 to E-140; S-1 to S-139; S-1 to V-138; S-1 to S-137; S-1 to N-136; 
S-1 to A-135; S-1 to T-134; S-1 to I-133; S-1 to N-132; S-1 to R-131; S-1 to 
K-130; S-1 to S-129; S-1 to K-128; S-1 to F-127; S-1 to Y-126; S-1 to F-125; 
S-1 to A-124; S-1 to A-123; S-1 to C-122; S-1 to F-121; S-1 to M-120; S-1 to 
G-119; S-1 to V-118; S-1 to I-117; S-1 to V-116; S-1 to I-115; S-1 to G-114; 
S-1 to F-113; S-1 to I-112; S-1 to I-111; S-1 to C-110; S-1 to S-109; S-1 to 
I-108; S-1 to S-107; S-1 to L-106; S-1 to V-105; S-1 to Q-104; S-1 to R-103; 
S-1 to Q-102; S-1 to Y-101; S-1 to V-100; S-1 to E-99; S-1 to E-98; S-1 to 
S-97; S-1 to E-96; S-1 to M-95; S-1 to F-94; S-1 to E-93; S-1 to I-92; S-1 to 
G-91; S-1 to L-90; S-1 to H-89; S-1 to N-88; S-1 to P-87; S-1 to D-86; S-1 to 
S-85; S-1 to L-84; S-1 to I-83; S-1 to S-82; S-1 to D-81; S-1 to T-80; S-1 to 
K-79; S-1 to P-78; S-1 to L-77; S-1 to F-76; S-1 to Q-75; S-1 to D-74; S-1 to 
C-73; S-1 to R-72; S-1 to V-71; S-1 to G-70; S-1 to Q-69; S-1 to Y-68; S-1 to 
G-67; S-1 to E-66; S-1 to K-65; S-1 to C-64; S-1 to R-63; S-1 to C-62; S-1 to 
H-61; S-1 to K-60; S-1 to H-59; S-1 to S-58; S-1 to G-57; S-1 to T-56; S-1 to 
L-55; S-1 to T-54; S-1 to E-53; S-1 to I-52; S-1 to V-51; S-1 to F-50; S-1 to 
C-49; S-1 to E-48; S-1 to G-47; S-1 to D-46; S-1 to N-45; S-1 to L-44; S-1 to 
C-43; S-1 to Y-42; S-1 to A-41; S-1 to L-40; S-1 to D-39; S-1 to K-38; S-1 to 
D-37; S-1 to R-36; S-1 to C-35; S-1 to P-34; S-1 to K-33; S-1 to F-32; S-1 to 
H-31; S-1 to E-30; S-1 to S-29; S-1 to R-28; S-1 to E-27; S-1 to T-26; S-1 to 
S-25; S-1 to Y-24; S-1 to T-23; S-1 to T-22; S-1 to T-21; S-1 to H-20; S-1 to 
F-19; S-1 to K-18; S-1 to P-17; S-1 to S-16; S-1 to T-15; S-1 to S-14; S-1 to 
T-13; S-1 to E-12; S-1 to P-11; S-1 to T-10; S-1 to T-9; S-1 to T-8; S-1 to T-7; 
and S-1 to A-6 of the HLF sequence shown in SEQ ID NO:2. Polynucleotides 
encoding these polypeptides also are provided. 
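
Residue-labelled ranges of this kind can be generated from a one-letter amino acid string. The Python sketch below is illustrative only: the function name is an assumption, and the eight-letter input is just the first eight residues as read from the labels in the lists above (S-1 through T-8), used as a toy sequence rather than the full SEQ ID NO:2.

```python
# Illustrative sketch: build labels such as "S-1 to A-6" for successive
# C-terminal truncations of a one-letter amino acid string.
def c_terminal_labels(seq, min_len=6):
    """Labels for fragments 1..m, with m from len(seq)-1 down to min_len."""
    first = f"{seq[0]}-1"
    return [
        f"{first} to {seq[m - 1]}-{m}"      # seq is 0-based, labels are 1-based
        for m in range(len(seq) - 1, min_len - 1, -1)
    ]


toy = "SSSSSATT"   # toy input: residues 1-8 only, not the complete sequence
labels = c_terminal_labels(toy)
print(labels)
```

With the complete 157-residue sequence as input, the same function would run from the fragment ending at residue 156 down to the fragment ending at residue 6.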
The invention also provides polypeptides having one or more amino 
acids deleted from both the amino and the carboxyl termini of an HLF 
polypeptide, which may be described generally as having residues n'-m' of 
SEQ ID NO:2, where n' and m' are integers as described above. 

Also as mentioned above, even if deletion of one or more amino acids 
from the N-terminus of a protein results in modification or loss of one or more 
biological functions of the protein, other biological activities may still be 
retained. Thus, the ability of the shortened HLF mutein to induce and/or bind to 
antibodies which recognize the complete or mature form of the protein generally 
will be retained when less than the majority of the residues of the complete or 
mature protein are removed from the N-terminus. Whether a particular 
polypeptide lacking N-terminal residues of a complete protein retains such 
immunologic activities can readily be determined by routine methods described 
herein and otherwise known in the art. It is not unlikely that an HLF mutein with 
a large number of deleted N-terminal amino acid residues may retain some 
biological or immunogenic activities. In fact, peptides composed of as few as six 
HLF amino acid residues may often evoke an immune response. 

Accordingly, the present invention further provides polypeptides having 
one or more residues deleted from the amino terminus of the amino acid 
sequence of the HLF shown in SEQ ID NO:22, up to the aspartic acid residue at 
position number 715, and polynucleotides encoding such polypeptides. In 
particular, the present invention provides polypeptides comprising the amino 
acid sequence of residues n"-720 of SEQ ID NO:22, where n" is an integer in 
the range of 2-715, and 716 is the position of the first residue from the 
N-terminus of the complete HLF polypeptide believed to be required for at least 
immunogenic activity of the HLF protein. 

More in particular, the invention provides polynucleotides encoding 
polypeptides comprising, or alternatively consisting of, the amino acid sequence 
of residues S-2 to K-720; E-3 to K-720; G-4 to K-720; A-5 to K-720; A-6 to 
K-720; A-7 to K-720; A-8 to K-720; S-9 to K-720; P-10 to K-720; P-11 to 
K-720; G-12 to K-720; A-13 to K-720; A-14 to K-720; S-15 to K-720; A-16 to 
K-720; A-17 to K-720; A-18 to K-720; A-19 to K-720; S-20 to K-720; A-21 to 
K-720; E-22 to K-720; E-23 to K-720; G-24 to K-720; T-25 to K-720; A-26 to 
K-720; A-27 to K-720; A-28 to K-720; A-29 to K-720; A-30 to K-720; A-31 to 
K-720; A-32 to K-720; A-33 to K-720; A-34 to K-720; G-35 to K-720; G-36 to 
K-720; G-37 to K-720; P-38 to K-720; D-39 to K-720; G-40 to K-720; G-41 to 
K-720; G-42 to K-720; E-43 to K-720; G-44 to K-720; A-45 to K-720; A-46 to 
K-720; E-47 to K-720; P-48 to K-720; P-49 to K-720; R-50 to K-720; E-51 to 
K-720; L-52 to K-720; R-53 to K-720; C-54 to K-720; S-55 to K-720; D-56 to 
K-720; C-57 to K-720; I-58 to K-720; V-59 to K-720; W-60 to K-720; N-61 to 
K-720; R-62 to K-720; Q-63 to K-720; Q-64 to K-720; T-65 to K-720; W-66 to 
K-720; L-67 to K-720; C-68 to K-720; V-69 to K-720; V-70 to K-720; P-71 to 
K-720; L-72 to K-720; F-73 to K-720; I-74 to K-720; G-75 to K-720; F-76 to 
K-720; I-77 to K-720; G-78 to K-720; L-79 to K-720; G-80 to K-720; L-81 to 
K-720; S-82 to K-720; L-83 to K-720; M-84 to K-720; L-85 to K-720; L-86 to 
K-720; K-87 to K-720; W-88 to K-720; I-89 to K-720; V-90 to K-720; V-91 to 
K-720; G-92 to K-720; S-93 to K-720; V-94 to K-720; K-95 to K-720; E-96 to 
K-720; Y-97 to K-720; V-98 to K-720; P-99 to K-720; T-100 to K-720; D-101 
to K-720; L-102 to K-720; V-103 to K-720; D-104 to K-720; S-105 to K-720; 
K-106 to K-720; G-107 to K-720; M-108 to K-720; G-109 to K-720; Q-110 to 
K-720; D-111 to K-720; P-112 to K-720; F-113 to K-720; F-114 to K-720; 
L-115 to K-720; S-116 to K-720; K-117 to K-720; P-118 to K-720; S-119 to 
K-720; S-120 to K-720; F-121 to K-720; P-122 to K-720; K-123 to K-720; 
A-124 to K-720; M-125 to K-720; E-126 to K-720; T-127 to K-720; T-128 to 
K-720; T-129 to K-720; T-130 to K-720; T-131 to K-720; T-132 to K-720; 
S-133 to K-720; T-134 to K-720; T-135 to K-720; S-136 to K-720; P-137 to 
K-720; A-138 to K-720; T-139 to K-720; P-140 to K-720; S-141 to K-720; 
A-142 to K-720; G-143 to K-720; G-144 to K-720; A-145 to K-720; A-146 to 
K-720; S-147 to K-720; S-148 to K-720; R-149 to K-720; T-150 to K-720; 
P-151 to K-720; N-152 to K-720; R-153 to K-720; I-154 to K-720; S-155 to 
K-720; T-156 to K-720; R-157 to K-720; L-158 to K-720; T-159 to K-720; 
T-160 to K-720; I-161 to K-720; T-162 to K-720; R-163 to K-720; A-164 to 
K-720; P-165 to K-720; T-166 to K-720; R-167 to K-720; F-168 to K-720; 
P-169 to K-720; G-170 to K-720; H-171 to K-720; R-172 to K-720; V-173 to 
K-720; P-174 to K-720; I-175 to K-720; R-176 to K-720; A-177 to K-720; 
S-178 to K-720; P-179 to K-720; R-180 to K-720; S-181 to K-720; T-182 to 
K-720; T-183 to K-720; A-184 to K-720; R-185 to K-720; N-186 to K-720; 
T-187 to K-720; A-188 to K-720; A-189 to K-720; P-190 to K-720; A-191 to 
K-720; T-192 to K-720; V-193 to K-720; P-194 to K-720; S-195 to K-720; 
T-196 to K-720; T-197 to K-720; A-198 to K-720; P-199 to K-720; F-200 to 
K-720; F-201 to K-720; S-202 to K-720; S-203 to K-720; S-204 to K-720; 
T-205 to K-720; L-206 to K-720; G-207 to K-720; S-208 to K-720; R-209 to 
K-720; P-210 to K-720; P-211 to K-720; V-212 to K-720; P-213 to K-720; 
G-214 to K-720; T-215 to K-720; P-216 to K-720; S-217 to K-720; T-218 to 
K-720; Q-219 to K-720; A-220 to K-720; M-221 to K-720; P-222 to K-720; 
S-223 to K-720; W-224 to K-720; P-225 to K-720; T-226 to K-720; A-227 to 
K-720; A-228 to K-720; Y-229 to K-720; A-230 to K-720; T-231 to K-720; 
S-232 to K-720; S-233 to K-720; Y-234 to K-720; L-235 to K-720; H-236 to 
K-720; D-237 to K-720; S-238 to K-720; T-239 to K-720; P-240 to K-720; 
S-241 to K-720; W-242 to K-720; T-243 to K-720; L-244 to K-720; S-245 to 
K-720; P-246 to K-720; F-247 to K-720; Q-248 to K-720; D-249 to K-720; 
A-250 to K-720; A-251 to K-720; S-252 to K-720; S-253 to K-720; S-254 to 
K-720; S-255 to K-720; S-256 to K-720; S-257 to K-720; S-258 to K-720; 
S-259 to K-720; S-260 to K-720; S-261 to K-720; T-262 to K-720; T-263 to 
K-720; T-264 to K-720; T-265 to K-720; P-266 to K-720; E-267 to K-720; 
T-268 to K-720; S-269 to K-720; T-270 to K-720; S-271 to K-720; P-272 to 
K-720; K-273 to K-720; F-274 to K-720; H-275 to K-720; T-276 to K-720; 
T-277 to K-720; T-278 to K-720; Y-279 to K-720; S-280 to K-720; T-281 to 
K-720; E-282 to K-720; R-283 to K-720; S-284 to K-720; E-285 to K-720; 
H-286 to K-720; F-287 to K-720; K-288 to K-720; P-289 to K-720; C-290 to 
K-720; R-291 to K-720; D-292 to K-720; K-293 to K-720; D-294 to K-720; 
L-295 to K-720; A-296 to K-720; Y-297 to K-720; C-298 to K-720; L-299 to 
K-720; N-300 to K-720; D-301 to K-720; G-302 to K-720; E-303 to K-720; 
C-304 to K-720; F-305 to K-720; V-306 to K-720; I-307 to K-720; E-308 to 
K-720; T-309 to K-720; L-310 to K-720; T-311 to K-720; G-312 to K-720; 
S-313 to K-720; H-314 to K-720; K-315 to K-720; H-316 to K-720; C-317 to 
K-720; R-318 to K-720; C-319 to K-720; K-320 to K-720; E-321 to K-720; 
G-322 to K-720; Y-323 to K-720; Q-324 to K-720; G-325 to K-720; V-326 to 
K-720; R-327 to K-720; C-328 to K-720; D-329 to K-720; Q-330 to K-720; 
F-331 to K-720; L-332 to K-720; P-333 to K-720; K-334 to K-720; T-335 to 
K-720; D-336 to K-720; S-337 to K-720; I-338 to K-720; L-339 to K-720; 
S-340 to K-720; D-341 to K-720; P-342 to K-720; T-343 to K-720; D-344 to 
K-720; H-345 to K-720; L-346 to K-720; G-347 to K-720; I-348 to K-720; 
E-349 to K-720; F-350 to K-720; M-351 to K-720; E-352 to K-720; S-353 to 
K-720; E-354 to K-720; E-355 to K-720; V-356 to K-720; Y-357 to K-720; 
Q-358 to K-720; R-359 to K-720; Q-360 to K-720; V-361 to K-720; L-362 to 
K-720; S-363 to K-720; I-364 to K-720; S-365 to K-720; C-366 to K-720; 
I-367 to K-720; I-368 to K-720; F-369 to K-720; G-370 to K-720; I-371 to 
K-720; V-372 to K-720; I-373 to K-720; V-374 to K-720; G-375 to K-720; 
M-376 to K-720; F-377 to K-720; C-378 to K-720; A-379 to K-720; A-380 to 
K-720; F-381 to K-720; Y-382 to K-720; F-383 to K-720; K-384 to K-720; 
S-385 to K-720; K-386 to K-720; K-387 to K-720; Q-388 to K-720; A-389 to 
K-720; K-390 to K-720; Q-391 to K-720; I-392 to K-720; Q-393 to K-720; 
E-394 to K-720; Q-395 to K-720; L-396 to K-720; K-397 to K-720; V-398 to 
K-720; P-399 to K-720; Q-400 to K-720; N-401 to K-720; G-402 to K-720; 
K-403 to K-720; S-404 to K-720; Y-405 to K-720; S-406 to K-720; L-407 to 
K-720; K-408 to K-720; A-409 to K-720; S-410 to K-720; S-411 to K-720; 
T-412 to K-720; M-413 to K-720; A-414 to K-720; K-415 to K-720; S-416 to 
K-720; E-417 to K-720; N-418 to K-720; L-419 to K-720; V-420 to K-720; 
K-421 to K-720; S-422 to K-720; H-423 to K-720; V-424 to K-720; Q-425 to 
K-720; L-426 to K-720; Q-427 to K-720; N-428 to K-720; Y-429 to K-720; 
S-430 to K-720; K-431 to K-720; V-432 to K-720; E-433 to K-720; R-434 to 
K-720; H-435 to K-720; P-436 to K-720; V-437 to K-720; T-438 to K-720; 
A-439 to K-720; L-440 to K-720; E-441 to K-720; K-442 to K-720; M-443 to 
K-720; M-444 to K-720; E-445 to K-720; S-446 to K-720; S-447 to K-720; 
F-448 to K-720; V-449 to K-720; G-450 to K-720; P-451 to K-720; Q-452 to 
K-720; S-453 to K-720; F-454 to K-720; P-455 to K-720; E-456 to K-720; 
V-457 to K-720; P-458 to K-720; S-459 to K-720; P-460 to K-720; D-461 to 
K-720; R-462 to K-720; G-463 to K-720; S-464 to K-720; Q-465 to K-720; 
S-466 to K-720; V-467 to K-720; K-468 to K-720; H-469 to K-720; H-470 to 
K-720; R-471 to K-720; S-472 to K-720; L-473 to K-720; S-474 to K-720; 
S-475 to K-720; C-476 to K-720; C-477 to K-720; S-478 to K-720; P-479 to 
K-720; G-480 to K-720; Q-481 to K-720; R-482 to K-720; S-483 to K-720; 
G-484 to K-720; M-485 to K-720; L-486 to K-720; H-487 to K-720; R-488 to 
K-720; N-489 to K-720; A-490 to K-720; F-491 to K-720; R-492 to K-720; 
R-493 to K-720; T-494 to K-720; P-495 to K-720; P-496 to K-720; S-497 to 
K-720; P-498 to K-720; R-499 to K-720; S-500 to K-720; R-501 to K-720; 
L-502 to K-720; G-503 to K-720; G-504 to K-720; I-505 to K-720; V-506 to 
K-720; G-507 to K-720; P-508 to K-720; A-509 to K-720; Y-510 to K-720; 
Q-511 to K-720; Q-512 to K-720; L-513 to K-720; E-514 to K-720; E-515 to 
K-720; S-516 to K-720; R-517 to K-720; I-518 to K-720; P-519 to K-720; 
D-520 to K-720; Q-521 to K-720; D-522 to K-720; T-523 to K-720; I-524 to 
K-720; P-525 to K-720; C-526 to K-720; Q-527 to K-720; G-528 to K-720; 
I-529 to K-720; E-530 to K-720; V-531 to K-720; R-532 to K-720; K-533 to 
K-720; T-534 to K-720; I-535 to K-720; S-536 to K-720; H-537 to K-720; 
L-538 to K-720; P-539 to K-720; I-540 to K-720; Q-541 to K-720; L-542 to 
K-720; W-543 to K-720; C-544 to K-720; V-545 to K-720; E-546 to K-720; 
R-547 to K-720; P-548 to K-720; L-549 to K-720; D-550 to K-720; L-551 to 
K-720; K-552 to K-720; Y-553 to K-720; S-554 to K-720; S-555 to K-720; 
S-556 to K-720; G-557 to K-720; L-558 to K-720; K-559 to K-720; T-560 to 
K-720; Q-561 to K-720; R-562 to K-720; N-563 to K-720; T-564 to K-720; 
S-565 to K-720; I-566 to K-720; N-567 to K-720; M-568 to K-720; Q-569 to 
K-720; L-570 to K-720; P-571 to K-720; S-572 to K-720; R-573 to K-720; 
E-574 to K-720; T-575 to K-720; N-576 to K-720; P-577 to K-720; Y-578 to 
K-720; F-579 to K-720; N-580 to K-720; S-581 to K-720; L-582 to K-720; 
E-583 to K-720; Q-584 to K-720; K-585 to K-720; D-586 to K-720; L-587 to 
K-720; V-588 to K-720; G-589 to K-720; Y-590 to K-720; S-591 to K-720; 
S-592 to K-720; T-593 to K-720; R-594 to K-720; A-595 to K-720; S-596 to 
K-720; S-597 to K-720; V-598 to K-720; P-599 to K-720; I-600 to K-720; 
I-601 to K-720; P-602 to K-720; S-603 to K-720; V-604 to K-720; G-605 to 
K-720; L-606 to K-720; E-607 to K-720; E-608 to K-720; T-609 to K-720; 
C-610 to K-720; L-611 to K-720; Q-612 to K-720; M-613 to K-720; P-614 to 
K-720; G-615 to K-720; I-616 to K-720; S-617 to K-720; E-618 to K-720; 
V-619 to K-720; K-620 to K-720; S-621 to K-720; I-622 to K-720; K-623 to 
K-720; W-624 to K-720; C-625 to K-720; K-626 to K-720; N-627 to K-720; 
S-628 to K-720; Y-629 to K-720; S-630 to K-720; A-631 to K-720; D-632 to 
K-720; V-633 to K-720; V-634 to K-720; N-635 to K-720; V-636 to K-720; 
S-637 to K-720; I-638 to K-720; P-639 to K-720; V-640 to K-720; S-641 to 
K-720; D-642 to K-720; C-643 to K-720; L-644 to K-720; I-645 to K-720; 
A-646 to K-720; E-647 to K-720; Q-648 to K-720; Q-649 to K-720; E-650 to 
K-720; V-651 to K-720; K-652 to K-720; I-653 to K-720; L-654 to K-720; 
L-655 to K-720; E-656 to K-720; T-657 to K-720; V-658 to K-720; Q-659 to 
K-720; E-660 to K-720; Q-661 to K-720; I-662 to K-720; R-663 to K-720; 
I-664 to K-720; L-665 to K-720; T-666 to K-720; D-667 to K-720; A-668 to 
K-720; R-669 to K-720; R-670 to K-720; S-671 to K-720; E-672 to K-720; 
D-673 to K-720; Y-674 to K-720; E-675 to K-720; L-676 to K-720; A-677 to 
K-720; S-678 to K-720; V-679 to K-720; E-680 to K-720; T-681 to K-720; 
E-682 to K-720; D-683 to K-720; S-684 to K-720; A-685 to K-720; S-686 to 
K-720; E-687 to K-720; N-688 to K-720; T-689 to K-720; A-690 to K-720; 
F-691 to K-720; L-692 to K-720; P-693 to K-720; L-694 to K-720; S-695 to 
K-720; P-696 to K-720; T-697 to K-720; A-698 to K-720; K-699 to K-720; 
S-700 to K-720; E-701 to K-720; R-702 to K-720; E-703 to K-720; A-704 to 
K-720; Q-705 to K-720; F-706 to K-720; V-707 to K-720; L-708 to K-720; 
R-709 to K-720; N-710 to K-720; E-711 to K-720; I-712 to K-720; Q-713 to 
K-720; R-714 to K-720; and D-715 to K-720 of the HLF sequence shown in 
SEQ ID NO:22. Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

Also as mentioned above, even if deletion of one or more amino acids 
from the C-terminus of a protein results in modification or loss of one or more 
biological functions of the protein, other biological activities may still be 
retained. Thus, the ability of the shortened HLF mutein to induce and/or bind to 
antibodies which recognize the complete or mature form of the protein generally 
will be retained when less than the majority of the residues of the complete or 
mature protein are removed from the C-terminus. Whether a particular 
polypeptide lacking C-terminal residues of a complete protein retains such 
immunologic activities can readily be determined by routine methods described 
herein and otherwise known in the art. It is not unlikely that an HLF mutein with 
a large number of deleted C-terminal amino acid residues may retain some 
biological or immunogenic activities. In fact, peptides composed of as few as six 
HLF amino acid residues may often evoke an immune response. 

Accordingly, the present invention further provides polypeptides having 
one or more residues deleted from the carboxy terminus of the amino acid 
sequence of the HLF shown in SEQ ID NO:22, up to the alanine residue at 
position number 6, and polynucleotides encoding such polypeptides. In 
particular, the present invention provides polypeptides comprising the amino 
acid sequence of residues 1-m" of SEQ ID NO:22, where m" is an integer in the
range of 7-718, and 6 is the position of the first residue from the C-terminus of 
the complete HLF polypeptide believed to be required for at least immunogenic 
activity of the HLF protein. 

More in particular, the invention provides polynucleotides encoding 
polypeptides comprising, or alternatively consisting of, the amino acid sequence 
of residues M-1 to T-719; M-1 to L-718; M-1 to A-717; M-1 to S-716; M-1 to
D-715; M-1 to R-714; M-1 to Q-713; M-1 to I-712; M-1 to E-711; M-1 to
N-710; M-1 to R-709; M-1 to L-708; M-1 to V-707; M-1 to F-706; M-1 to
Q-705; M-1 to A-704; M-1 to E-703; M-1 to R-702; M-1 to E-701; M-1 to
S-700; M-1 to K-699; M-1 to A-698; M-1 to T-697; M-1 to P-696; M-1 to
S-695; M-1 to L-694; M-1 to P-693; M-1 to L-692; M-1 to F-691; M-1 to
A-690; M-1 to T-689; M-1 to N-688; M-1 to E-687; M-1 to S-686; M-1 to



WO 98/57989 



PCT/US98/12403






A-685; M-1 to S-684; M-1 to D-683; M-1 to E-682; M-1 to T-681; M-1 to
E-680; M-1 to V-679; M-1 to S-678; M-1 to A-677; M-1 to L-676; M-1 to
E-675; M-1 to Y-674; M-1 to D-673; M-1 to E-672; M-1 to S-671; M-1 to
R-670; M-1 to R-669; M-1 to A-668; M-1 to D-667; M-1 to T-666; M-1 to
L-665; M-1 to I-664; M-1 to R-663; M-1 to I-662; M-1 to Q-661; M-1 to E-660;
M-1 to Q-659; M-1 to V-658; M-1 to T-657; M-1 to E-656; M-1 to L-655; M-1
to L-654; M-1 to I-653; M-1 to K-652; M-1 to V-651; M-1 to E-650; M-1 to
Q-649; M-1 to Q-648; M-1 to E-647; M-1 to A-646; M-1 to I-645; M-1 to
L-644; M-1 to C-643; M-1 to D-642; M-1 to S-641; M-1 to V-640; M-1 to
P-639; M-1 to I-638; M-1 to S-637; M-1 to V-636; M-1 to N-635; M-1 to
V-634; M-1 to V-633; M-1 to D-632; M-1 to A-631; M-1 to S-630; M-1 to
Y-629; M-1 to S-628; M-1 to N-627; M-1 to K-626; M-1 to C-625; M-1 to
W-624; M-1 to K-623; M-1 to I-622; M-1 to S-621; M-1 to K-620; M-1 to
V-619; M-1 to E-618; M-1 to S-617; M-1 to I-616; M-1 to G-615; M-1 to
P-614; M-1 to M-613; M-1 to Q-612; M-1 to L-611; M-1 to C-610; M-1 to
T-609; M-1 to E-608; M-1 to E-607; M-1 to L-606; M-1 to G-605; M-1 to
Y-604; M-1 to S-603; M-1 to P-602; M-1 to I-601; M-1 to I-600; M-1 to P-599;
M-1 to V-598; M-1 to S-597; M-1 to S-596; M-1 to A-595; M-1 to R-594; M-1
to T-593; M-1 to S-592; M-1 to S-591; M-1 to Y-590; M-1 to G-589; M-1 to
V-588; M-1 to L-587; M-1 to D-586; M-1 to K-585; M-1 to Q-584; M-1 to
E-583; M-1 to L-582; M-1 to S-581; M-1 to N-580; M-1 to F-579; M-1 to
Y-578; M-1 to P-577; M-1 to N-576; M-1 to T-575; M-1 to E-574; M-1 to
R-573; M-1 to S-572; M-1 to P-571; M-1 to L-570; M-1 to Q-569; M-1 to
M-568; M-1 to N-567; M-1 to I-566; M-1 to S-565; M-1 to T-564; M-1 to
N-563; M-1 to R-562; M-1 to Q-561; M-1 to T-560; M-1 to K-559; M-1 to
L-558; M-1 to G-557; M-1 to S-556; M-1 to S-555; M-1 to S-554; M-1 to
Y-553; M-1 to K-552; M-1 to L-551; M-1 to D-550; M-1 to L-549; M-1 to
P-548; M-1 to R-547; M-1 to E-546; M-1 to V-545; M-1 to C-544; M-1 to
W-543; M-1 to L-542; M-1 to Q-541; M-1 to I-540; M-1 to P-539; M-1 to
L-538; M-1 to H-537; M-1 to S-536; M-1 to I-535; M-1 to T-534; M-1 to
K-533; M-1 to R-532; M-1 to V-531; M-1 to E-530; M-1 to I-529; M-1 to
G-528; M-1 to Q-527; M-1 to C-526; M-1 to P-525; M-1 to I-524; M-1 to
T-523; M-1 to D-522; M-1 to Q-521; M-1 to D-520; M-1 to P-519; M-1 to
I-518; M-1 to R-517; M-1 to S-516; M-1 to E-515; M-1 to E-514; M-1 to
L-513; M-1 to Q-512; M-1 to Q-511; M-1 to Y-510; M-1 to A-509; M-1 to
P-508; M-1 to G-507; M-1 to V-506; M-1 to I-505; M-1 to G-504; M-1 to
G-503; M-1 to L-502; M-1 to R-501; M-1 to S-500; M-1 to R-499; M-1 to
P-498; M-1 to S-497; M-1 to P-496; M-1 to P-495; M-1 to T-494; M-1 to
R-493; M-1 to R-492; M-1 to F-491; M-1 to A-490; M-1 to N-489; M-1 to
R-488; M-1 to H-487; M-1 to L-486; M-1 to M-485; M-1 to G-484; M-1 to
S-483; M-1 to R-482; M-1 to Q-481; M-1 to G-480; M-1 to P-479; M-1 to
S-478; M-1 to C-477; M-1 to C-476; M-1 to S-475; M-1 to S-474; M-1 to
L-473; M-1 to S-472; M-1 to R-471; M-1 to H-470; M-1 to H-469; M-1 to
K-468; M-1 to V-467; M-1 to S-466; M-1 to Q-465; M-1 to S-464; M-1 to
G-463; M-1 to R-462; M-1 to D-461; M-1 to P-460; M-1 to S-459; M-1 to
P-458; M-1 to V-457; M-1 to E-456; M-1 to P-455; M-1 to F-454; M-1 to
S-453; M-1 to Q-452; M-1 to P-451; M-1 to G-450; M-1 to V-449; M-1 to
F-448; M-1 to S-447; M-1 to S-446; M-1 to E-445; M-1 to M-444; M-1 to
M-443; M-1 to K-442; M-1 to E-441; M-1 to L-440; M-1 to A-439; M-1 to
T-438; M-1 to V-437; M-1 to P-436; M-1 to H-435; M-1 to R-434; M-1 to
E-433; M-1 to V-432; M-1 to K-431; M-1 to S-430; M-1 to Y-429; M-1 to
N-428; M-1 to Q-427; M-1 to L-426; M-1 to Q-425; M-1 to V-424; M-1 to
H-423; M-1 to S-422; M-1 to K-421; M-1 to V-420; M-1 to L-419; M-1 to
N-418; M-1 to E-417; M-1 to S-416; M-1 to K-415; M-1 to A-414; M-1 to
M-413; M-1 to T-412; M-1 to S-411; M-1 to S-410; M-1 to A-409; M-1 to
K-408; M-1 to L-407; M-1 to S-406; M-1 to Y-405; M-1 to S-404; M-1 to
K-403; M-1 to G-402; M-1 to N-401; M-1 to Q-400; M-1 to P-399; M-1 to
V-398; M-1 to K-397; M-1 to L-396; M-1 to Q-395; M-1 to E-394; M-1 to
Q-393; M-1 to I-392; M-1 to Q-391; M-1 to K-390; M-1 to A-389; M-1 to
Q-388; M-1 to K-387; M-1 to K-386; M-1 to S-385; M-1 to K-384; M-1 to
F-383; M-1 to Y-382; M-1 to F-381; M-1 to A-380; M-1 to A-379; M-1 to
C-378; M-1 to F-377; M-1 to M-376; M-1 to G-375; M-1 to V-374; M-1 to
I-373; M-1 to V-372; M-1 to I-371; M-1 to G-370; M-1 to F-369; M-1 to I-368;
M-1 to I-367; M-1 to C-366; M-1 to S-365; M-1 to I-364; M-1 to S-363; M-1 to
L-362; M-1 to V-361; M-1 to Q-360; M-1 to R-359; M-1 to Q-358; M-1 to
Y-357; M-1 to V-356; M-1 to E-355; M-1 to E-354; M-1 to S-353; M-1 to
E-352; M-1 to M-351; M-1 to F-350; M-1 to E-349; M-1 to I-348; M-1 to
G-347; M-1 to L-346; M-1 to H-345; M-1 to D-344; M-1 to T-343; M-1 to
P-342; M-1 to D-341; M-1 to S-340; M-1 to L-339; M-1 to I-338; M-1 to
S-337; M-1 to D-336; M-1 to T-335; M-1 to K-334; M-1 to P-333; M-1 to
L-332; M-1 to F-331; M-1 to Q-330; M-1 to D-329; M-1 to C-328; M-1 to
R-327; M-1 to V-326; M-1 to G-325; M-1 to Q-324; M-1 to Y-323; M-1 to
G-322; M-1 to E-321; M-1 to K-320; M-1 to C-319; M-1 to R-318; M-1 to
C-317; M-1 to H-316; M-1 to K-315; M-1 to H-314; M-1 to S-313; M-1 to
G-312; M-1 to T-311; M-1 to L-310; M-1 to T-309; M-1 to E-308; M-1 to
I-307; M-1 to V-306; M-1 to F-305; M-1 to C-304; M-1 to E-303; M-1 to
G-302; M-1 to D-301; M-1 to N-300; M-1 to L-299; M-1 to C-298; M-1 to
Y-297; M-1 to A-296; M-1 to L-295; M-1 to D-294; M-1 to K-293; M-1 to
D-292; M-1 to R-291; M-1 to C-290; M-1 to P-289; M-1 to K-288; M-1 to
F-287; M-1 to H-286; M-1 to E-285; M-1 to S-284; M-1 to R-283; M-1 to
E-282; M-1 to T-281; M-1 to S-280; M-1 to Y-279; M-1 to T-278; M-1 to
T-277; M-1 to T-276; M-1 to H-275; M-1 to F-274; M-1 to K-273; M-1 to
P-272; M-1 to S-271; M-1 to T-270; M-1 to S-269; M-1 to T-268; M-1 to
E-267; M-1 to P-266; M-1 to T-265; M-1 to T-264; M-1 to T-263; M-1 to
T-262; M-1 to S-261; M-1 to S-260; M-1 to S-259; M-1 to S-258; M-1 to
S-257; M-1 to S-256; M-1 to S-255; M-1 to S-254; M-1 to S-253; M-1 to
S-252; M-1 to A-251; M-1 to A-250; M-1 to D-249; M-1 to Q-248; M-1 to
F-247; M-1 to P-246; M-1 to S-245; M-1 to L-244; M-1 to T-243; M-1 to
W-242; M-1 to S-241; M-1 to P-240; M-1 to T-239; M-1 to S-238; M-1 to
D-237; M-1 to H-236; M-1 to L-235; M-1 to Y-234; M-1 to S-233; M-1 to
S-232; M-1 to T-231; M-1 to A-230; M-1 to Y-229; M-1 to A-228; M-1 to
A-227; M-1 to T-226; M-1 to P-225; M-1 to W-224; M-1 to S-223; M-1 to
P-222; M-1 to M-221; M-1 to A-220; M-1 to Q-219; M-1 to T-218; M-1 to
S-217; M-1 to P-216; M-1 to T-215; M-1 to G-214; M-1 to P-213; M-1 to
V-212; M-1 to P-211; M-1 to P-210; M-1 to R-209; M-1 to S-208; M-1 to
G-207; M-1 to L-206; M-1 to T-205; M-1 to S-204; M-1 to S-203; M-1 to
S-202; M-1 to F-201; M-1 to F-200; M-1 to P-199; M-1 to A-198; M-1 to
T-197; M-1 to T-196; M-1 to S-195; M-1 to P-194; M-1 to V-193; M-1 to
T-192; M-1 to A-191; M-1 to P-190; M-1 to A-189; M-1 to A-188; M-1 to
T-187; M-1 to N-186; M-1 to R-185; M-1 to A-184; M-1 to T-183; M-1 to
T-182; M-1 to S-181; M-1 to R-180; M-1 to P-179; M-1 to S-178; M-1 to
A-177; M-1 to R-176; M-1 to I-175; M-1 to P-174; M-1 to V-173; M-1 to
R-172; M-1 to H-171; M-1 to G-170; M-1 to P-169; M-1 to F-168; M-1 to
R-167; M-1 to T-166; M-1 to P-165; M-1 to A-164; M-1 to R-163; M-1 to
T-162; M-1 to I-161; M-1 to T-160; M-1 to T-159; M-1 to L-158; M-1 to
R-157; M-1 to T-156; M-1 to S-155; M-1 to I-154; M-1 to R-153; M-1 to
N-152; M-1 to P-151; M-1 to T-150; M-1 to R-149; M-1 to S-148; M-1 to
S-147; M-1 to A-146; M-1 to A-145; M-1 to G-144; M-1 to G-143; M-1 to
A-142; M-1 to S-141; M-1 to P-140; M-1 to T-139; M-1 to A-138; M-1 to
P-137; M-1 to S-136; M-1 to T-135; M-1 to T-134; M-1 to S-133; M-1 to
T-132; M-1 to T-131; M-1 to T-130; M-1 to T-129; M-1 to T-128; M-1 to
T-127; M-1 to E-126; M-1 to M-125; M-1 to A-124; M-1 to K-123; M-1 to
P-122; M-1 to F-121; M-1 to S-120; M-1 to S-119; M-1 to P-118; M-1 to
K-117; M-1 to S-116; M-1 to L-115; M-1 to F-114; M-1 to F-113; M-1 to
P-112; M-1 to D-111; M-1 to Q-110; M-1 to G-109; M-1 to M-108; M-1 to
G-107; M-1 to K-106; M-1 to S-105; M-1 to D-104; M-1 to V-103; M-1 to
L-102; M-1 to D-101; M-1 to T-100; M-1 to P-99; M-1 to V-98; M-1 to Y-97;
M-1 to E-96; M-1 to K-95; M-1 to V-94; M-1 to S-93; M-1 to G-92; M-1 to
V-91; M-1 to V-90; M-1 to I-89; M-1 to W-88; M-1 to K-87; M-1 to L-86; M-1
to L-85; M-1 to M-84; M-1 to L-83; M-1 to S-82; M-1 to L-81; M-1 to G-80;
M-1 to L-79; M-1 to G-78; M-1 to I-77; M-1 to F-76; M-1 to G-75; M-1 to I-74;
M-1 to F-73; M-1 to L-72; M-1 to P-71; M-1 to V-70; M-1 to V-69; M-1 to
C-68; M-1 to L-67; M-1 to W-66; M-1 to T-65; M-1 to Q-64; M-1 to Q-63; M-1
to R-62; M-1 to N-61; M-1 to W-60; M-1 to V-59; M-1 to I-58; M-1 to C-57;
M-1 to D-56; M-1 to S-55; M-1 to C-54; M-1 to R-53; M-1 to L-52; M-1 to
C-51; M-1 to R-50; M-1 to P-49; M-1 to P-48; M-1 to E-47; M-1 to A-46; M-1
to A-45; M-1 to G-44; M-1 to E-43; M-1 to G-42; M-1 to G-41; M-1 to G-40;
M-1 to D-39; M-1 to P-38; M-1 to G-37; M-1 to G-36; M-1 to G-35; M-1 to
A-34; M-1 to A-33; M-1 to A-32; M-1 to A-31; M-1 to A-30; M-1 to A-29; M-1
to A-28; M-1 to A-27; M-1 to A-26; M-1 to T-25; M-1 to G-24; M-1 to E-23;
M-1 to E-22; M-1 to A-21; M-1 to S-20; M-1 to A-19; M-1 to A-18; M-1 to
A-17; M-1 to A-16; M-1 to S-15; M-1 to A-14; M-1 to A-13; M-1 to G-12; M-1
to P-11; M-1 to P-10; M-1 to S-9; M-1 to A-8; M-1 to A-7; and M-1 to A-6 of the
HLF sequence shown in SEQ ID NO:22. Polynucleotides encoding these
polypeptides also are provided.

The invention also provides polypeptides having one or more amino 
acids deleted from both the amino and the carboxyl termini of an HLF 
polypeptide, which may be described generally as having residues n"-m" of 
SEQ ID NO:22, where n" and m" are integers as described above. 
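
The n"-m" notation above can be illustrated with a minimal Python sketch that enumerates every fragment spanning residues n through m of a sequence. The function name, the toy sequence and the `min_length` parameter are assumptions introduced only for this illustration; they are not taken from the sequence listing.

```python
def terminal_deletion_fragments(sequence, min_length=6):
    """Yield (n, m, fragment) for each fragment spanning residues n..m.

    Positions are 1-based and inclusive, matching the n"-m" notation.
    Only fragments of at least min_length residues are produced.
    """
    total = len(sequence)
    for n in range(1, total - min_length + 2):
        for m in range(n + min_length - 1, total + 1):
            yield n, m, sequence[n - 1:m]


# Toy 8-residue sequence used purely for illustration (not the HLF sequence).
toy = "MAAASPPG"
fragments = list(terminal_deletion_fragments(toy, min_length=6))
for n, m, fragment in fragments:
    print(f"residues {n}-{m}: {fragment}")
```

With n fixed at 1 the loop yields only C-terminal deletions, and with m fixed at the sequence length it yields only N-terminal deletions; the remaining combinations are deletions from both termini.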

Other Mutants 

In addition to terminal deletion forms of the protein discussed above, it 
also will be recognized by one of ordinary skill in the art that some amino acid 
sequences of the HLF polypeptide can be varied without significant effect on the
structure or function of the protein. If such differences in sequence are
contemplated, it should be remembered that there will be critical areas on the
protein which determine activity.

Thus, the invention further includes variations of the HLF polypeptide
which show substantial HLF polypeptide activity or which include regions of
HLF protein such as the protein portions discussed below. Such mutants
include deletions, insertions, inversions, repeats, and type substitutions selected
according to general rules known in the art so as to have little effect on activity.
For example, guidance concerning how to make phenotypically silent amino
acid substitutions is provided by Bowie and colleagues (Science
247:1306-1310; 1990), wherein the authors indicate that there are two main
approaches for studying the tolerance of an amino acid sequence to change. The
first method relies on the process of evolution, in which mutations are either
accepted or rejected by natural selection. The second approach uses genetic
engineering to introduce amino acid changes at specific positions of a cloned
gene and selections or screens to identify sequences that maintain functionality.
As the authors state, these studies have revealed that proteins are
surprisingly tolerant of amino acid substitutions. The authors further indicate
which amino acid changes are likely to be permissive at a certain position of the
protein. For example, most buried amino acid residues require nonpolar side
chains, whereas few features of surface side chains are generally conserved.
Other such phenotypically silent substitutions are described in Bowie, J. U. et
al., supra, and the references cited therein. Typically seen as conservative
substitutions are the replacements, one for another, among the aliphatic amino
acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr;
exchange of the acidic residues Asp and Glu; substitution between the amide
residues Asn and Gln; exchange of the basic residues Lys and Arg; and
replacements among the aromatic residues Phe and Tyr.

Thus, the fragment, derivative or analog of the polypeptide of SEQ ID
NO:2, or that encoded by the deposited cDNA, may be (i) one in which one or
more of the amino acid residues are substituted with a conserved or
non-conserved amino acid residue (preferably a conserved amino acid residue) and
such substituted amino acid residue may or may not be one encoded by the
genetic code, (ii) one in which one or more of the amino acid residues includes a
substituent group, (iii) one in which the extracellular domain of the HLF
polypeptide is fused with another compound, such as a compound to increase
the half-life of the polypeptide (for example, polyethylene glycol), (iv) one in
which the EGF domain of the HLF polypeptide is fused with another
compound, such as a compound to increase the half-life of the polypeptide (for
example, polyethylene glycol), (v) one in which the additional amino acids are
fused to the extracellular form of the polypeptide, such as an IgG Fc fusion
region peptide or leader or secretory sequence or a sequence which is employed
for purification of the above form of the polypeptide or a proprotein sequence,
or (vi) one in which the additional amino acids are fused to the EGF form of the
polypeptide, such as an IgG Fc fusion region peptide or leader or secretory
sequence or a sequence which is employed for purification of the above form of
the polypeptide or a proprotein sequence. Such fragments, derivatives and
analogs are deemed to be within the scope of those skilled in the art from the
teachings herein.

Thus, the HLF of the present invention may include one or more amino 
acid substitutions, deletions or additions, either from natural mutations or 
human manipulation. As indicated, changes are preferably of a minor nature, 
such as conservative amino acid substitutions that do not significantly affect the 
folding or activity of the protein (see Table 1). 



TABLE 1. Conservative Amino Acid Substitutions.

Aromatic       Phenylalanine
               Tryptophan
               Tyrosine

Hydrophobic    Leucine
               Isoleucine
               Valine

Polar          Glutamine
               Asparagine

Basic          Arginine
               Lysine
               Histidine

Acidic         Aspartic Acid
               Glutamic Acid

Small          Alanine
               Serine
               Threonine
               Methionine
               Glycine
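
The groupings of Table 1 can be expressed as a simple lookup. The following is a minimal Python sketch, assuming one-letter residue codes; the dictionary and function names are hypothetical and introduced only for illustration. Residues that Table 1 does not list (for example cysteine and proline) belong to no group here.

```python
# Table 1 groups, keyed by group name, using one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "Aromatic": {"F", "W", "Y"},
    "Hydrophobic": {"L", "I", "V"},
    "Polar": {"Q", "N"},
    "Basic": {"R", "K", "H"},
    "Acidic": {"D", "E"},
    "Small": {"A", "S", "T", "M", "G"},
}


def is_conservative(original, replacement):
    """Return True if the two different residues share a Table 1 group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


print(is_conservative("L", "V"))  # both in the Hydrophobic group
print(is_conservative("D", "K"))  # Acidic versus Basic
```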



In specific embodiments, the number of substitutions, deletions or
additions in the amino acid sequence of Figure 1A and/or any of the polypeptide
fragments described herein is 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19,
18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 or 30-20, 20-10,
20-15, 15-10, 10-5 or 1-5.

Amino acids in the HLF protein of the present invention that are 
essential for function can be identified by methods known in the art, such as 
site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham and 
Wells, Science 244:1081-1085 (1989)). The latter procedure introduces single 
alanine mutations at every residue in the molecule. The resulting mutant 
molecules are then tested for biological activity such as receptor binding or in
vitro or in vivo proliferative activity.
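
As a rough illustration of the alanine-scanning idea, the following minimal Python sketch generates one single-alanine mutant per position of a sequence. The function name and the toy sequence are assumptions made for this example, and skipping positions that are already alanine is likewise an assumption rather than part of the cited procedure.

```python
def alanine_scan(sequence):
    """Return (position, original_residue, mutant_sequence) tuples.

    One mutant is produced per residue, with that residue replaced by
    alanine. Positions are 1-based. Residues that are already alanine
    are skipped because the mutant would equal the starting sequence.
    """
    mutants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        mutants.append((index + 1, residue, mutant))
    return mutants


# Toy sequence for illustration only (not the HLF sequence).
scan = alanine_scan("MKAD")
for position, residue, mutant in scan:
    print(f"{residue}{position}A: {mutant}")
```

Each mutant in the resulting list would then be expressed and assayed separately, as the paragraph above describes.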

Of special interest are substitutions of charged amino acids with other 
charged or neutral amino acids which may produce proteins with highly 
desirable improved characteristics, such as less aggregation. Aggregation may 
not only reduce activity but also be problematic when preparing pharmaceutical 
formulations, because aggregates can be immunogenic (Pinckard et al., Clin.
Exp. Immunol. 2:331-340 (1967); Robbins et al., Diabetes 36:838-845 (1987);
Cleland et al., Crit. Rev. Therapeutic Drug Carrier Systems 10:307-377 (1993)).

Replacement of amino acids can also change the selectivity of the
binding of a ligand to cell surface receptors. For example, Ostade et al., Nature
361:266-268 (1993) describes certain mutations resulting in selective binding of
TNF-α to only one of the two known types of TNF receptors. Sites that are
critical for ligand-receptor binding can also be determined by structural analysis
such as crystallization, nuclear magnetic resonance or photoaffinity labeling
(Smith et al., J. Mol. Biol. 224:899-904 (1992) and de Vos et al., Science
255:306-312 (1992)).

Since HLF is a member of the EGF-related protein family, to modulate 
rather than completely eliminate biological activities of HLF, preferably 
mutations are made in sequences encoding amino acids in the HLF conserved 
domain, i.e., in amino acid positions about 26 to about 93 of SEQ ID NO:2, 
more preferably in residues within this region which are not conserved in all 
members of the EGF family. Also forming part of the present invention are 
isolated polynucleotides comprising nucleic acid sequences which encode the 
above HLF mutants. 

The polypeptides of the present invention are preferably provided in an
isolated form, and preferably are substantially purified. A recombinantly
produced version of the HLF polypeptide can be substantially purified by the
one-step method described by Smith and Johnson (Gene 67:31-40; 1988).
Polypeptides of the invention also can be purified from natural or recombinant
sources using anti-HLF antibodies of the invention in methods which are well
known in the art of protein purification.

The invention further provides an isolated HLF polypeptide comprising 
an amino acid sequence selected from the group consisting of: (a) the amino 
acid sequence of the HLF polypeptide having the complete amino acid sequence 
shown in SEQ ID NO:2 (i.e., positions 1 to 157 of SEQ ID NO:2) or the 
complete amino acid sequence encoded by the cDNA clone contained in the 
ATCC Deposit No. 209123; (b) the amino acid sequence of the predicted 
extracellular domain of the HLF polypeptide having the amino acid sequence 
shown in SEQ ID NO:2 (i.e., positions 1 to 101 of SEQ ID NO:2) or as 
encoded by the cDNA clone contained in the ATCC Deposit No. 209123; (c) 
the amino acid sequence of the predicted transmembrane domain of the HLF 
polypeptide having the amino acid sequence shown in SEQ ID NO:2 (i.e., 
positions 102 to 121 of SEQ ID NO:2) or as encoded by the cDNA clone 
contained in the ATCC Deposit No. 209123; (d) the amino acid sequence of the 
predicted intracellular domain of the HLF polypeptide having the amino acid 
sequence shown in SEQ ID NO:2 (i.e., positions 122 to 157 of SEQ ID NO:2) 
or as encoded by the cDNA clone contained in the ATCC Deposit No. 209123; 
and (e) the amino acid sequence of a soluble HLF polypeptide having the 
extracellular and intracellular domains but lacking the transmembrane domain. 
The polypeptides of the present invention also include polypeptides having an 
amino acid sequence at least 80% identical, more preferably at least 90% 
identical, and still more preferably 95%, 96%, 97%, 98% or 99% identical to 
(or at most 20% different, more preferably at most 10% different, and still more 
preferably 5%, 4%, 3%, 2% or 1% different from) those described in (a), (b), 
(c), (d), or (e) above, as well as polypeptides having an amino acid sequence 
with at least 90% similarity, and more preferably at least 95% similarity, to 
those above. 

Further polypeptides of the present invention include polypeptides 
which have at least 90% similarity, more preferably at least 95% similarity, and 
still more preferably at least 96%, 97%, 98% or 99% similarity to those 
described above. The polypeptides of the invention also comprise those which 
are at least 80% identical, more preferably at least 90% or 95% identical, still 
more preferably at least 96%, 97%, 98% or 99% identical to (or at most 20% 
different, more preferably at most 10% or 5% different, still more preferably at 
most 4%, 3%, 2% or 1% different from) the polypeptide encoded by the
deposited cDNA or to the polypeptide of SEQ ID NO:2, and also include
portions of such polypeptides with at least 30 amino acids and more preferably 
at least 50 amino acids. 

By "% similarity" for two polypeptides is intended a similarity score
produced by comparing the amino acid sequences of the two polypeptides using
the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for
Unix, Genetics Computer Group, University Research Park, 575 Science
Drive, Madison, WI 53711) and the default settings for determining similarity.
Bestfit uses the local homology algorithm of Smith and Waterman (Adv. Appl. 
Math. 2:482-489; 1981) to find the best segment of similarity between two 
sequences. 

By a polypeptide having an amino acid sequence at least, for example, 
95% "identical" to a reference amino acid sequence of an HLF polypeptide is 
intended that the amino acid sequence of the polypeptide is identical to the 
reference sequence except that the polypeptide sequence may include up to five 
amino acid alterations per each 100 amino acids of the reference amino acid sequence of
the HLF polypeptide. In other words, to obtain a polypeptide having an amino 
acid sequence at least 95% identical to a reference amino acid sequence, up to 
5% of the amino acid residues in the reference sequence may be deleted or 
substituted with another amino acid, or a number of amino acids up to 5% of the 
total amino acid residues in the reference sequence may be inserted into the 
reference sequence. These alterations of the reference sequence may occur at 
the amino or carboxy terminal positions of the reference amino acid sequence or 
anywhere between those terminal positions, interspersed either individually 
among residues in the reference sequence or in one or more contiguous groups 
within the reference sequence. 
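
The arithmetic behind this definition can be sketched as follows. This is a minimal Python illustration, assuming whole-number percentages; the function name is hypothetical. It returns the largest number of altered residues that still leaves a polypeptide at least the stated percentage identical to a reference of the given length.

```python
def max_alterations(reference_length, percent_identity):
    """Largest number of altered residues allowed at the given identity.

    Integer arithmetic is used so that the result is rounded down, since
    a fractional residue cannot be altered.
    """
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    return (reference_length * (100 - percent_identity)) // 100


# A 100-residue reference at 95% identity allows up to five alterations,
# matching the "five amino acid alterations per each 100" statement above.
print(max_alterations(100, 95))
# A 157-residue reference, the length given for SEQ ID NO:2.
print(max_alterations(157, 95))
```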

As a practical matter, whether any particular polypeptide is at least 90%,
95%, 96%, 97%, 98% or 99% identical to (or at most 10%, 5%, 4%, 3%, 2%
or 1% different from), for instance, the amino acid sequence shown in SEQ ID
NO:2 or to the amino acid sequence encoded by deposited cDNA clone can be
determined conventionally using known computer programs such as the Bestfit
program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics
Computer Group, University Research Park, 575 Science Drive, Madison, WI
53711). When using Bestfit or any other sequence alignment program to
determine whether a particular sequence is, for instance, 95% identical to (or
5% different from) a reference sequence according to the present invention, the
parameters are set, of course, such that the percentage of identity is calculated
over the full length of the reference amino acid sequence and that gaps in
homology of up to 5% of the total number of amino acid residues in the
reference sequence are allowed.

By a polypeptide having an amino acid sequence at least, for example, 
95% "identical" to (or 5% different from) a query amino acid sequence of the 
present invention, it is intended that the amino acid sequence of the subject 
polypeptide is identical to the query sequence except that the subject polypeptide 
sequence may include up to five amino acid alterations per each 100 amino acids 
of the query amino acid sequence. In other words, to obtain a polypeptide 
having an amino acid sequence at least 95% identical to a query amino acid 
sequence, up to 5% of the amino acid residues in the subject sequence may be 
inserted, deleted (indels), or substituted with another amino acid. These
alterations of the reference sequence may occur at the amino or carboxy terminal 
positions of the reference amino acid sequence or anywhere between those 
terminal positions, interspersed either individually among residues in the 
reference sequence or in one or more contiguous groups within the reference 
sequence. 

As a practical matter, whether any particular polypeptide is at least 90%,
95%, 96%, 97%, 98% or 99% identical to (or at most 10%, 5%, 4%, 3%, 2%
or 1% different from), for instance, the amino acid sequences shown in SEQ ID
NO:2 or to the amino acid sequence encoded by deposited DNA clone can be
determined conventionally using known computer programs. A preferred
method for determining the best overall match between a query sequence (a
sequence of the present invention) and a subject sequence, also referred to as a
global sequence alignment, can be determined using the FASTDB computer
program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990)
6:237-245). In a sequence alignment the query and subject sequences are either
both nucleotide sequences or both amino acid sequences. The result of said
global sequence alignment is in percent identity. Preferred parameters used in a
FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch
Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff
Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size
Penalty=0.05, Window Size=500 or the length of the subject amino acid
sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or
C-terminal deletions, not because of internal deletions, a manual correction must
be made to the results. This is because the FASTDB program does not account
for N- and C-terminal truncations of the subject sequence when calculating
global percent identity. For subject sequences truncated at the N- and
C-termini, relative to the query sequence, the percent identity is corrected by
calculating the number of residues of the query sequence that are N- and
C-terminal of the subject sequence, which are not matched/aligned with a
corresponding subject residue, as a percent of the total residues of the query
sequence. Whether a residue is matched/aligned is determined by results of the
FASTDB sequence alignment. This percentage is then subtracted from the
percent identity, calculated by the above FASTDB program using the specified
parameters, to arrive at a final percent identity score. This final percent identity
score is what is used for the purposes of the present invention. Only residues to
the N- and C-termini of the subject sequence, which are not matched/aligned
with the query sequence, are considered for the purposes of manually adjusting
the percent identity score. That is, only query residue positions outside the
farthest N- and C-terminal residues of the subject sequence are considered.

For example, a 90 amino acid residue subject sequence is aligned with a
100 residue query sequence to determine percent identity. The deletion occurs
at the N-terminus of the subject sequence and therefore, the FASTDB alignment
does not show a matching/alignment of the first 10 residues at the N-terminus.
The 10 unpaired residues represent 10% of the sequence (number of residues at
the N- and C-termini not matched/total number of residues in the query
sequence) so 10% is subtracted from the percent identity score calculated by the
FASTDB program. If the remaining 90 residues were perfectly matched the
final percent identity would be 90%. In another example, a 90 residue subject
sequence is compared with a 100 residue query sequence. This time the
deletions are internal deletions so there are no residues at the N- or C-termini of
the subject sequence which are not matched/aligned with the query. In this case
the percent identity calculated by FASTDB is not manually corrected. Once
again, only residue positions outside the N- and C-terminal ends of the subject
sequence, as displayed in the FASTDB alignment, which are not
matched/aligned with the query sequence are manually corrected for. No other
manual corrections are to be made for the purposes of the present invention.
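
The manual correction described above amounts to a single subtraction, sketched here in Python. The function and parameter names are hypothetical. The sketch assumes the FASTDB percent identity and the counts of unmatched terminal query residues have already been read from the alignment output; it does not run FASTDB itself.

```python
def corrected_percent_identity(fastdb_identity, query_length,
                               unmatched_n_terminal=0,
                               unmatched_c_terminal=0):
    """Apply the terminal-truncation correction to a FASTDB identity score.

    The unmatched N- and C-terminal query residues are expressed as a
    percentage of the query length and subtracted from the reported score.
    Internal deletions are not counted, so both counts are zero in that case.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    unmatched = unmatched_n_terminal + unmatched_c_terminal
    penalty = 100.0 * unmatched / query_length
    return fastdb_identity - penalty


# First example from the text: 100-residue query, 90-residue subject missing
# the first 10 N-terminal residues, remaining 90 residues perfectly matched.
n_terminal_case = corrected_percent_identity(100.0, 100, unmatched_n_terminal=10)
print(n_terminal_case)  # 90.0

# Second example: internal deletions only, so the score is left unchanged.
# The input value 90.0 is a placeholder, not a figure from the text.
internal_case = corrected_percent_identity(90.0, 100)
print(internal_case)  # 90.0
```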

The polypeptide of the present invention could be used as a molecular
weight marker on SDS-PAGE gels or on molecular sieve gel filtration columns
using methods well known to those of skill in the art.

As described in detail below, the polypeptides of the present invention
can also be used to raise polyclonal and monoclonal antibodies, which are
useful in assays for detecting HLF protein expression as described below or as
agonists and antagonists capable of enhancing or inhibiting HLF protein
function. Further, such polypeptides can be used in the yeast two-hybrid
system to "capture" HLF protein binding proteins which are also candidate
agonists and antagonists according to the present invention. The yeast
two-hybrid system is described in Fields and Song, Nature 340:245-246 (1989).

Epitope-Bearing Portions

In another aspect, the invention provides a peptide or polypeptide 
comprising an epitope-bearing portion of a polypeptide of the invention. The 
epitope of this polypeptide portion is an immunogenic or antigenic epitope of a 
polypeptide of the invention. An "immunogenic epitope" is defined as a part of 
a protein that elicits an antibody response when the whole protein is the 
immunogen. On the other hand, a region of a protein molecule to which an 
antibody can bind is defined as an "antigenic epitope." The number of 
immunogenic epitopes of a protein generally is less than the number of antigenic 
epitopes (Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002; 1983).

As to the selection of peptides or polypeptides bearing an antigenic
epitope (i.e., that contain a region of a protein molecule to which an antibody
can bind), it is well known in the art that relatively short synthetic peptides that
mimic part of a protein sequence are routinely capable of eliciting an antiserum
that reacts with the partially mimicked protein (Sutcliffe, J. G., et al., Science
219:660-666; 1983). Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized
by a set of simple chemical rules, and are confined neither to immunodominant
regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the
invention are therefore useful to raise antibodies, including monoclonal
antibodies, that bind specifically to a polypeptide of the invention. See, for
instance, Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention 
preferably contain a sequence of at least seven, more preferably at least nine 
and most preferably between about 15 to about 30 amino acids contained within 
the amino acid sequence of a polypeptide of the invention. Non-limiting 
examples of antigenic polypeptides or peptides that can be used to generate 
HLF-specific antibodies include: a polypeptide comprising amino acid residues 
from about Ser-1 to about Thr-8, about Thr-9 to about Lys-18, about Thr-23 to
about His-31, about Phe-32 to about Leu-40, about Cys-43 to about Val-51,
about Thr-56 to about Tyr-68, about Gln-75 to about Leu-84, about Tyr-126 to
about Ala-135, about Ser-137 to about Leu-146, and about Ser-148 to about
Lys-157. These polypeptide fragments have been determined to bear antigenic
epitopes of the HLF protein by the analysis of the Jameson-Wolf antigenic
index, as shown in Figure 3, above.
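
As an illustration only, the residue ranges listed above can be tabulated and checked against the stated minimum length of seven amino acids. The following Python sketch assumes the boundaries are taken exactly as written, although the text qualifies each of them with "about"; the names FRAGMENTS, position, fragment_length and meets_minimum are hypothetical and are not part of the disclosure.

```python
# Residue ranges of the antigenic fragments listed in the text
# (1-based, inclusive). The source qualifies every boundary with "about".
FRAGMENTS = [
    ("Ser-1", "Thr-8"), ("Thr-9", "Lys-18"), ("Thr-23", "His-31"),
    ("Phe-32", "Leu-40"), ("Cys-43", "Val-51"), ("Thr-56", "Tyr-68"),
    ("Gln-75", "Leu-84"), ("Tyr-126", "Ala-135"), ("Ser-137", "Leu-146"),
    ("Ser-148", "Lys-157"),
]

MIN_LENGTH = 7  # "at least seven" amino acids, per the text


def position(residue: str) -> int:
    """Return the sequence position from a label such as 'Thr-56'."""
    return int(residue.split("-")[1])


def fragment_length(start: str, end: str) -> int:
    """Number of residues in an inclusive range."""
    return position(end) - position(start) + 1


def meets_minimum(start: str, end: str, minimum: int = MIN_LENGTH) -> bool:
    """True if the range is at least `minimum` residues long."""
    return fragment_length(start, end) >= minimum


if __name__ == "__main__":
    for start, end in FRAGMENTS:
        print(f"{start} to {end}: {fragment_length(start, end)} residues")
```

Under these assumptions every listed range spans 8 to 13 residues, so each satisfies the seven-residue minimum.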

The epitope-bearing peptides and polypeptides of the invention may be 
produced by any conventional means. See, e.g., Houghten, R. A. (1985) 
"General method for the rapid solid-phase synthesis of large numbers of 
peptides: specificity of antigen-antibody interaction at the level of individual
amino acids" Proc. Natl. Acad. Sci. USA 82:5131-5135; this "Simultaneous
Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent
No. 4,631,211 to Houghten et al. (1986).

Epitope-bearing peptides and polypeptides of the invention are used to 
induce antibodies according to methods well known in the art. See, for
instance, Sutcliffe et al., supra; Wilson et al., supra; Chow, M. et al., Proc.
Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol.
66:2347-2354 (1985). Immunogenic epitope-bearing peptides of the invention,
i.e., those parts of a protein that elicit an antibody response when the whole
protein is the immunogen, are identified according to methods known in the art.
See, for instance, Geysen et al., supra. Further still, U.S. Patent No.
5,194,392 to Geysen (1990) describes a general method of detecting or
determining the sequence of monomers (amino acids or other compounds)
which is a topological equivalent of the epitope (i.e., a "mimotope") which is
complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989)
describes a method of detecting or determining a sequence of monomers which
is a topographical equivalent of a ligand which is complementary to the ligand
binding site of a particular receptor of interest. Similarly, U.S. Patent No.
5,480,971 to Houghten, R. A. et al. (1996) on Peralkylated Oligopeptide
Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and
libraries of such peptides, as well as methods for using such oligopeptide sets
and libraries for determining the sequence of a peralkylated oligopeptide that
preferentially binds to an acceptor molecule of interest. Thus, non-peptide
analogs of the epitope-bearing peptides of the invention also can be made
routinely by these methods.

Fusion Proteins

As one of skill in the art will appreciate, HLF polypeptides of the
present invention and the epitope-bearing fragments thereof described above can
be combined with parts of the constant domain of immunoglobulins (IgG),
resulting in chimeric polypeptides. These fusion proteins facilitate purification 
and show an increased half-life in vivo. This has been shown, e.g., for 
chimeric proteins consisting of the first two domains of the human 
CD4-polypeptide and various domains of the constant regions of the heavy or 
light chains of mammalian immunoglobulins (EP A 394,827; Traunecker et al.,
Nature 331:84-86; 1988). Fusion proteins that have a disulfide-linked dimeric 
structure due to the IgG part can also be more efficient in binding and 
neutralizing other molecules than the monomeric HLF protein or protein 
fragment alone (Fountoulakis, et al., J. Biochem. 270:3958-3964; 1995).

Furthermore, HLF polypeptides of interest of the present invention, for 
example the extracellular EGF-like domain shown in Figure 1A, can be
combined with a recombinant toxin. Such a fusion polypeptide can be used to 
target the toxin, for example Pseudomonas exotoxin A, to a tumor through the 
efficient binding of the extracellular or smaller soluble domains of the HLF 
molecule of the present invention. In fact, Jeschke and colleagues (Int. J.
Cancer 60:730-739; 1995) and Fiddes and coworkers (Cell Growth Differ.
6:1567-1577; 1995) have demonstrated that heregulin-toxin fusion proteins can
be utilized in such a fashion. 

Antibodies 

HLF-protein specific antibodies for use in the present invention can be 
raised against the intact HLF protein or an antigenic polypeptide fragment 
thereof, which may be presented together with a carrier protein, such as an 
albumin, to an animal system (such as rabbit or mouse) or, if it is long enough 
(at least about 25 amino acids), without a carrier. 

As used herein, the term "antibody" (Ab) or "monoclonal antibody" 
(Mab) is meant to include intact molecules as well as antibody fragments (such 
as, for example, Fab and F(ab')2 fragments) which are capable of specifically
binding to HLF protein. Fab and F(ab')2 fragments lack the Fc fragment of
intact antibody, clear more rapidly from the circulation, and may have less
non-specific tissue binding than an intact antibody (Wahl et al., J. Nucl. Med.
24:316-325 (1983)). Thus, these fragments are preferred.

The antibodies of the present invention may be prepared by any of a
variety of methods. For example, cells expressing the HLF protein or an
antigenic fragment thereof can be administered to an animal in order to induce
the production of sera containing polyclonal antibodies. In a preferred method,
a preparation of HLF protein is prepared and purified to render it substantially
free of natural contaminants. Such a preparation is then introduced into an
animal in order to produce polyclonal antisera of greater specific activity.

In the most preferred method, the antibodies of the present invention are 
monoclonal antibodies (or HLF protein binding fragments thereof). Such
monoclonal antibodies can be prepared using hybridoma technology (Kohler et
al., Nature 256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976);
Kohler et al., Eur. J. Immunol. 6:292 (1976); Hammerling et al., in:
Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, N.Y., (1981) pp.
563-681). In general, such procedures involve immunizing an animal
(preferably a mouse) with a HLF protein antigen or, more preferably, with a
HLF protein-expressing cell. Suitable cells can be recognized by their capacity
to bind anti-HLF protein antibody. Such cells may be cultured in any suitable
tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum
(inactivated at about 56° C), and supplemented with about 10 g/l of nonessential
amino acids, about 1,000 U/ml of penicillin, and about 100 µg/ml of
streptomycin. The splenocytes of such mice are extracted and fused with a
suitable myeloma cell line. Any suitable myeloma cell line may be employed in
accordance with the present invention; however, it is preferable to employ the
parent myeloma cell line (SP20), available from the American Type Culture
Collection, Rockville, Maryland. After fusion, the resulting hybridoma cells are
selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981)). The
hybridoma cells obtained through such a selection are then assayed to identify
clones which secrete antibodies capable of binding the HLF protein antigen.

Alternatively, additional antibodies capable of binding to the HLF 
protein antigen may be produced in a two-step procedure through the use of 
anti-idiotypic antibodies. Such a method makes use of the fact that antibodies 
are themselves antigens, and that, therefore, it is possible to obtain an antibody
which binds to a second antibody. In accordance with this method,
HLF-protein specific antibodies are used to immunize an animal, preferably a
mouse. The splenocytes of such an animal are then used to produce hybridoma
cells, and the hybridoma cells are screened to identify clones which produce an
antibody whose ability to bind to the HLF protein-specific antibody can be
blocked by the HLF protein antigen. Such antibodies comprise anti-idiotypic
antibodies to the HLF protein-specific antibody and can be used to immunize an
animal to induce formation of further HLF protein-specific antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the 
antibodies of the present invention may be used according to the methods 
disclosed herein. Such fragments are typically produced by proteolytic 
cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin
(to produce F(ab')2 fragments). Alternatively, HLF protein-binding fragments
can be produced through the application of recombinant DNA technology or
through synthetic chemistry. 

For in vivo use of anti-HLF in humans, it may be preferable to use 
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced 
using genetic constructs derived from hybridoma cells producing the 
monoclonal antibodies described above. Methods for producing chimeric 
antibodies are known in the art (Morrison, Science 229:1202; 1985; Oi, et al.,
BioTechniques 4:214; 1986; Cabilly, et al., U.S. Patent No. 4,816,567;
Taniguchi, et al., EP 171496; Morrison, et al., EP 173494; Neuberger, et al.,
WO 8601533; Robinson, et al., WO 8702671; Boulianne, et al., Nature
312:643; 1984; Neuberger et al., Nature 314:268; 1985).

Disorders Related to the Regulation of Cell Growth 

Diagnosis 

The present inventors have discovered that HLF is apparently expressed 
detectably only in the amygdala, whole brain, and primary breast culture tissue.
For a number of disorders related to the regulation of cell growth, substantially
altered (increased or decreased) levels of HLF gene expression can be detected
in tissues or other cells or bodily fluids (e.g., sera, plasma, urine, synovial fluid
or spinal fluid) taken from an individual having such a disorder, relative to a
"standard" HLF gene expression level, that is, the HLF expression level in such
tissues or bodily fluids from an individual not having the disorder. Thus, the
invention provides a diagnostic method useful during diagnosis of a disorder
related to the regulation of cell growth, which involves measuring the
expression level of the gene encoding the HLF protein in such tissues or other
cells or bodily fluids from an individual and comparing the measured gene
expression level with a standard HLF gene expression level, whereby an
increase or decrease in the gene expression level compared to the standard is
indicative of such a disorder.

In particular, it is believed that certain tissues in mammals with breast or 
brain cancers express significantly enhanced levels of the HLF protein and 
mRNA encoding the HLF protein when compared to a corresponding 
"standard" level. Further, it is believed that enhanced levels of the HLF protein 
can be detected in certain body fluids (e.g., sera, plasma, urine, and spinal 
fluid) from mammals with such a cancer when compared to sera from mammals 
of the same species not having the cancer. 

Thus, the invention provides a diagnostic method useful during 
diagnosis of a disorder of the regulation of cell growth, including several types
of cancers, which involves measuring the expression level of the gene encoding
the HLF protein in tissues or other cells or bodily fluids from an individual and 
comparing the measured gene expression level with a standard HLF gene 
expression level, whereby an increase or decrease in the gene expression level 
compared to the standard is indicative of such a disorder. 

Where a diagnosis of a disorder of the regulation of cell growth, 
including diagnosis of a tumor, has already been made according to 
conventional methods, the present invention is useful as a prognostic indicator, 
whereby patients exhibiting enhanced HLF gene expression will experience a 
worse clinical outcome relative to patients expressing the gene at a level nearer 
the standard level. 

By "assaying the expression level of the gene encoding the HLF 
protein" is intended qualitatively or quantitatively measuring or estimating the 
level of the HLF protein or the level of the mRNA encoding the HLF protein in 
a first biological sample either directly (e.g., by determining or estimating 
absolute protein level or mRNA level) or relatively (e.g., by comparing to the 
HLF protein level or mRNA level in a second biological sample). Preferably, 
the HLF protein level or mRNA level in the first biological sample is measured 
or estimated and compared to a standard HLF protein level or mRNA level, the 
standard being taken from a second biological sample obtained from an 
individual not having the disorder or being determined by averaging levels from 
a population of individuals not having a disorder of the regulation of cell 
growth. As will be appreciated in the art, once a standard HLF protein level or 
mRNA level is known, it can be used repeatedly as a standard for comparison. 
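
The comparison described above, in which a measured level is set against a standard obtained by averaging levels from unaffected individuals, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the two-fold cut-off (fold_threshold) is a hypothetical value that the text does not specify, and the function names are likewise hypothetical.

```python
from statistics import mean


def standard_level(control_levels):
    """Average HLF protein or mRNA level from individuals without the disorder."""
    if not control_levels:
        raise ValueError("at least one control level is required")
    return mean(control_levels)


def compare_to_standard(sample_level, standard, fold_threshold=2.0):
    """Classify a sample level relative to the standard.

    fold_threshold is a hypothetical cut-off; the source gives no figure
    for what counts as a substantially altered level.
    """
    if standard <= 0 or sample_level < 0:
        raise ValueError("levels must be non-negative and the standard positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = sample_level / standard
    if ratio >= fold_threshold:
        return "increased"
    if ratio <= 1 / fold_threshold:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    std = standard_level([1.0, 1.2, 0.8])  # arbitrary units, invented values
    print(compare_to_standard(2.5, std))
```

Because the standard is computed once and passed in, it can be reused for repeated comparisons, as the text notes.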

By "biological sample" is intended any biological sample obtained from
an individual, body fluid, cell line, tissue culture, or other source which 
contains HLF protein or mRNA. As indicated, biological samples include body 
fluids (such as sera, plasma, urine, synovial fluid and spinal fluid) which 
contain free HLF protein, or the extracellular or EGF domains of the HLF 
protein, cancerous tissue, and other tissue sources found to express complete 
HLF protein, or the extracellular or EGF domains of the HLF protein, or an 
HLF receptor. Methods for obtaining tissue biopsies and body fluids from 
mammals are well known in the art. Where the biological sample is to include 
mRNA, a tissue biopsy is the preferred source. 

The present invention is useful for diagnosis or treatment of various 
disorders of the regulation of cell growth in mammals, preferably humans. 
Such disorders include breast cancer, brain cancers, including neuroblastomas 
and glioblastomas, developmental disorders, ovarian cancer, endometrial 
cancer, some types of colon cancers, and the like. 

Total cellular RNA can be isolated from a biological sample using any 
suitable technique such as the single-step guanidinium-thiocyanate-phenol- 
chloroform method described by Chomczynski and Sacchi (Anal. Biochem.
162:156-159; 1987). Levels of mRNA encoding the HLF protein are then 
assayed using any appropriate method. These include Northern blot analysis, 
S1 nuclease mapping, the polymerase chain reaction (PCR), reverse
transcription in combination with the polymerase chain reaction (RT-PCR), and 
reverse transcription in combination with the ligase chain reaction (RT-LCR). 

Assaying HLF protein levels in a biological sample can occur using 
antibody-based techniques. For example, HLF protein expression in tissues 
can be studied with classical immunohistological methods (Jalkanen, M., et al.,
J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol.
105:3087-3096 (1987)). Other antibody-based methods useful for detecting
HLF protein gene expression include immunoassays, such as the enzyme linked
immunosorbent assay (ELISA) and the radioimmunoassay (RIA). Suitable
antibody assay labels are known in the art and include enzyme labels, such as
glucose oxidase; radioisotopes, such as iodine (125I, 121I), carbon (14C),
sulfur (35S), tritium (3H), indium (112In), and technetium (99mTc); and
fluorescent labels, such as fluorescein and rhodamine, and biotin.

In addition to assaying HLF protein levels in a biological sample 
obtained from an individual, HLF protein can also be detected in vivo by
imaging. Antibody labels or markers for in vivo imaging of HLF protein
include those detectable by X-radiography, NMR or ESR. For X-radiography,
suitable labels include radioisotopes such as barium or cesium, which emit
detectable radiation but are not overtly harmful to the subject. Suitable markers
for NMR and ESR include those with a detectable characteristic spin, such as
deuterium, which may be incorporated into the antibody by labeling of nutrients
for the relevant hybridoma.

A HLF protein-specific antibody or antibody fragment which has been 
labeled with an appropriate detectable imaging moiety, such as a radioisotope 
(for example, 131I, 112In, 99mTc), a radio-opaque substance, or a material
detectable by nuclear magnetic resonance, is introduced (for example,
parenterally, subcutaneously or intraperitoneally) into the mammal to be
examined for immune system disorder. It will be understood in the art that the
size of the subject and the imaging system used will determine the quantity of
imaging moiety needed to produce diagnostic images. In the case of a
radioisotope moiety, for a human subject, the quantity of radioactivity injected
will normally range from about 5 to 20 millicuries of 99mTc. The labeled
antibody or antibody fragment will then preferentially accumulate at the location
of cells which contain HLF protein. In vivo tumor imaging is described in
S.W. Burchiel et al., "Immunopharmacokinetics of Radiolabeled Antibodies
and Their Fragments" (Chapter 13 in Tumor Imaging: The Radiochemical
Detection of Cancer, S.W. Burchiel and B. A. Rhodes, eds., Masson
Publishing Inc. (1982)).

Treatment 

As noted above, HLF polynucleotides and polypeptides are useful for 
diagnosis of conditions involving abnormally high or low expression of HLF 
activities. Given the cells and tissues where HLF is expressed as well as the 
activities modulated by HLF, it is readily apparent that a substantially altered 
(increased or decreased) level of expression of HLF in an individual compared 
to the standard or "normal" level produces pathological conditions related to the
bodily system(s) in which HLF is expressed and/or is active.

It will also be appreciated by one of ordinary skill that, since the HLF
protein of the invention is a member of the EGF family, the extracellular domain
of the protein may be released in soluble form from the cells which express the
HLF by proteolytic cleavage. Therefore, when HLF soluble extracellular
domain is added from an exogenous source to cells, tissues or the body of an
individual, the protein will exert its physiological activities on its target cells of
that individual.

Therefore, it will be appreciated that conditions caused by a decrease in 
the standard or normal level of HLF activity in an individual, particularly
disorders of cellular growth regulation, can be treated by administration of HLF
polypeptide (in the form of soluble extracellular domain or cells expressing the
complete protein). Thus, the invention also provides a method of treatment of an
individual in need of an increased level of HLF activity comprising
administering to such an individual a pharmaceutical composition comprising an
amount of an isolated HLF polypeptide of the invention, particularly an
extracellular form of the HLF protein of the invention, effective to increase the
HLF activity level in such an individual.

An individual who is in need of increased HLF activity will not express 
a sufficient amount of functional HLF protein; administration of recombinant
HLF protein, or more simply, of the active extracellular or the active EGF
domain, to such an individual will result in the presence of a sufficient
concentration of HLF activity in the bloodstream. In addition, an individual
who has an abnormally increased level of HLF activity will require the use of
an HLF antibody or antagonist, as described in the present invention. The use
of such HLF antagonists will result in a therapeutic lowering of the effective
level of HLF activity in the bloodstream. As a result of such treatment, the
affected individual will have an effective concentration of HLF activity which is
much closer to what is deemed "normal". Those of skill in the art will
recognize other indications where the ability to therapeutically adjust the level of
effective HLF activity is desirable.

It will be further appreciated by one of ordinary skill that HLF may be 
used as an additive or supplement for the in vitro culture of certain types of 
eukaryotic cells. Many cell types, including primary cell cultures, are highly 
fastidious and require a complex mixture of additives to the standard culture
medium to result in successful culture and survival of the cells. A number of
known growth factors and related molecules are currently used as supplements
to the medium of various cells. Such factors may include such molecules as
epidermal growth factor (EGF), keratinocyte growth factor (KGF), acidic
fibroblast growth factor (aFGF), insulin-like growth factor (IGF)-I, nerve
growth factor (NGF), and many others. Despite the availability and use of the
collection of growth factors listed above, a large number of cells and cell types
remain unculturable, either at all or for an extended period of time. Since
expression of HLF appears to be limited to the amygdala, whole brain, and
primary breast culture tissue or to other neural cells and tissues, HLF is useful
as an additive or growth factor in the culture of neural and a number of other
cells and cell types.

It will be further appreciated by the skilled artisan that many cells and
cell types require the absence of a specific growth factor or related molecule
from the culture medium. In the case of culturing cells which require the
absence of HLF from the culture medium, antagonists or antibodies of HLF
described herein may be used to bind to and remove HLF from culture medium
preparations, thus resulting in "HLF-free" culture media.

Formulations 

The HLF polypeptide composition will be formulated and dosed in a 
fashion consistent with good medical practice, taking into account the clinical 
condition of the individual patient (especially the side effects of treatment with
HLF polypeptide alone), the site of delivery of the HLF polypeptide
composition, the method of administration, the scheduling of administration,
and other factors known to practitioners. The "effective amount" of HLF
polypeptide for purposes herein is thus determined by such considerations.

As a general proposition, the total pharmaceutically effective amount of
HLF polypeptide administered parenterally per dose will be in the range of
about 1 µg/kg/day to 10 mg/kg/day of patient body weight, although, as noted
above, this will be subject to therapeutic discretion. More preferably, this dose
is at least 0.01 mg/kg/day, and most preferably for humans between about 0.01
and 1 mg/kg/day for the hormone. If given continuously, the HLF polypeptide
is typically administered at a dose rate of about 1 µg/kg/hour to about 50
µg/kg/hour, either by 1-4 injections per day or by continuous subcutaneous
infusions, for example, using a mini-pump. An intravenous bag solution may
also be employed. The length of treatment needed to observe changes and the
interval following treatment for responses to occur appears to vary depending
on the desired effect.
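
The per-kilogram ranges above scale linearly with body weight. The short Python sketch below shows only that arithmetic, for a hypothetical 70 kg subject; it is not dosing guidance, and the constant and function names are assumptions introduced for illustration.

```python
# Dose limits quoted in the text, expressed in micrograms (ug) per kg.
PER_DAY_UG_PER_KG = (1.0, 10_000.0)  # about 1 ug/kg/day to 10 mg/kg/day
PER_HOUR_UG_PER_KG = (1.0, 50.0)     # about 1 to about 50 ug/kg/hour


def scale_by_weight(limits_ug_per_kg, weight_kg):
    """Multiply a (low, high) per-kg range by body weight in kg."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = limits_ug_per_kg
    return (low * weight_kg, high * weight_kg)


def daily_dose_range_ug(weight_kg):
    """Total daily dose range in micrograms for the given body weight."""
    return scale_by_weight(PER_DAY_UG_PER_KG, weight_kg)


def infusion_rate_range_ug_per_hour(weight_kg):
    """Continuous-administration rate range in micrograms per hour."""
    return scale_by_weight(PER_HOUR_UG_PER_KG, weight_kg)


if __name__ == "__main__":
    weight = 70.0  # hypothetical body weight in kg
    print(daily_dose_range_ug(weight))              # (70.0, 700000.0)
    print(infusion_rate_range_ug_per_hour(weight))  # (70.0, 3500.0)
```

For the 70 kg example this gives 70 µg to 700 mg per day, and 70 µg to 3.5 mg per hour for continuous administration.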

Pharmaceutical compositions containing the HLF of the invention may 
be administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, drops or transdermal
patch), buccally, or as an oral or nasal spray. By "pharmaceutically acceptable
carrier" is meant a non-toxic solid, semisolid or liquid filler, diluent,
encapsulating material or formulation auxiliary of any type. The term
"parenteral" as used herein refers to modes of administration which include
intravenous, intramuscular, intraperitoneal, intrasternal, subcutaneous and
intraarticular injection and infusion.

The HLF polypeptide is also suitably administered by sustained-release
systems. Suitable examples of sustained-release compositions include semi-
permeable polymer matrices in the form of shaped articles, e.g., films, or
microcapsules. Sustained-release matrices include polylactides (U.S. Pat. No.
3,773,919, EP 58,481), copolymers of L-glutamic acid and gamma-ethyl-L-
glutamate (Sidman, U. et al., Biopolymers 22:547-556 (1983)), poly (2-
hydroxyethyl methacrylate) (R. Langer et al., J. Biomed. Mater. Res. 15:167-
277 (1981), and R. Langer, Chem. Tech. 12:98-105 (1982)), ethylene vinyl
acetate (R. Langer et al., Id.) or poly-D-(-)-3-hydroxybutyric acid (EP
133,988). Sustained-release HLF polypeptide compositions also include
liposomally entrapped HLF polypeptide. Liposomes containing HLF
polypeptide are prepared by methods known per se: DE 3,218,121; Epstein et
al., Proc. Natl. Acad. Sci. (USA) 82:3688-3692 (1985); Hwang et al., Proc.
Natl. Acad. Sci. (USA) 77:4030-4034 (1980); EP 52,322; EP 36,676; EP
88,046; EP 143,949; EP 142,641; Japanese Pat. Appl. 83-118008; U.S. Pat.
Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the liposomes
are of the small (about 200-800 Angstroms) unilamellar type in which the lipid
content is greater than about 30 mol. percent cholesterol, the selected proportion
being adjusted for the optimal HLF polypeptide therapy.

For parenteral administration, in one embodiment, the HLF polypeptide 
is formulated generally by mixing it at the desired degree of purity, in a unit
dosage injectable form (solution, suspension, or emulsion), with a
pharmaceutically acceptable carrier, i.e., one that is non-toxic to recipients at the
dosages and concentrations employed and is compatible with other ingredients
of the formulation. For example, the formulation preferably does not include
oxidizing agents and other compounds that are known to be deleterious to
polypeptides.

Generally, the formulations are prepared by contacting the HLF 
polypeptide uniformly and intimately with liquid carriers or finely divided solid 
carriers or both. Then, if necessary, the product is shaped into the desired 
formulation. Preferably the carrier is a parenteral carrier, more preferably a
solution that is isotonic with the blood of the recipient. Examples of such
carrier vehicles include water, saline, Ringer's solution, and dextrose solution.
Non-aqueous vehicles such as fixed oils and ethyl oleate are also useful herein, 
as well as liposomes. 

The carrier suitably contains minor amounts of additives such as 
substances that enhance isotonicity and chemical stability. Such materials are 
non-toxic to recipients at the dosages and concentrations employed, and include 
buffers such as phosphate, citrate, succinate, acetic acid, and other organic acids 
or their salts; antioxidants such as ascorbic acid; low molecular weight (less than 
about ten residues) polypeptides, e.g., polyarginine or tripeptides; proteins, 
such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers 
such as polyvinylpyrrolidone; amino acids, such as glycine, glutamic acid, 
aspartic acid, or arginine; monosaccharides, disaccharides, and other 
carbohydrates including cellulose or its derivatives, glucose, mannose, or
dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or 
sorbitol; counterions such as sodium; and/or nonionic surfactants such as 
polysorbates, poloxamers, or PEG. 

The HLF polypeptide is typically formulated in such vehicles at a 
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH
of about 3 to 8. It will be understood that the use of certain of the foregoing
excipients, carriers, or stabilizers will result in the formation of HLF
polypeptide salts.

HLF polypeptide to be used for therapeutic administration must be 
sterile. Sterility is readily accomplished by filtration through sterile filtration 
membranes (e.g., 0.2 micron membranes). Therapeutic HLF polypeptide 
compositions generally are placed into a container having a sterile access port, 
for example, an intravenous solution bag or vial having a stopper pierceable by 
a hypodermic injection needle. 

HLF polypeptide ordinarily will be stored in unit or multi-dose 
containers, for example, sealed ampoules or vials, as an aqueous solution or as 
a lyophilized formulation for reconstitution. As an example of a lyophilized 
formulation, 10-ml vials are filled with 5 ml of sterile-filtered 1% (w/v) aqueous 
HLF polypeptide solution, and the resulting mixture is lyophilized. The 
infusion solution is prepared by reconstituting the lyophilized HLF polypeptide 
using bacteriostatic Water-for-Injection.
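
The lyophilized-vial example reduces to simple arithmetic: a 1% (w/v) solution contains 1 g per 100 ml, i.e. 10 mg/ml, so a 5 ml fill holds 50 mg of HLF polypeptide. The Python sketch below works this through; the reconstitution volume is a hypothetical value, since the text does not state how much diluent is used, and the function names are assumptions.

```python
def vial_content_mg(fill_volume_ml, percent_wv):
    """Mass of polypeptide in a vial, given fill volume and % (w/v).

    1% (w/v) = 1 g per 100 ml = 10 mg/ml.
    """
    if fill_volume_ml <= 0 or percent_wv <= 0:
        raise ValueError("fill volume and concentration must be positive")
    return fill_volume_ml * percent_wv * 10.0


def reconstituted_concentration_mg_per_ml(content_mg, diluent_volume_ml):
    """Concentration after reconstituting the lyophilized cake."""
    if content_mg < 0 or diluent_volume_ml <= 0:
        raise ValueError("content must be non-negative and volume positive")
    return content_mg / diluent_volume_ml


if __name__ == "__main__":
    content = vial_content_mg(5.0, 1.0)  # values from the text
    print(content)                       # 50.0 mg per vial
    # Hypothetical diluent volume; the source does not specify one.
    print(reconstituted_concentration_mg_per_ml(content, 5.0))  # 10.0 mg/ml
```

With the hypothetical 5 ml of diluent the result, 10 mg/ml, falls at the top of the preferred 1-10 mg/ml range given earlier in this section.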

The invention also provides a pharmaceutical pack or kit comprising one 
or more containers filled with one or more of the ingredients of the 
pharmaceutical compositions of the invention. Associated with such container(s) 
can be a notice in the form prescribed by a governmental agency regulating the
manufacture, use or sale of pharmaceuticals or biological products, which notice
reflects approval by the agency of manufacture, use or sale for human
administration. In addition, the polypeptides of the present invention may be
employed in conjunction with other therapeutic compounds.

Agonists and Antagonists - Assays and Molecules 

The invention also provides a method of screening compounds to 
identify those which enhance or block the action of HLF on cells, such as its 
interaction with HLF-binding molecules such as receptor molecules. An agonist 
is a compound which increases the natural biological functions of HLF or which
functions in a manner similar to HLF, while antagonists decrease or eliminate
such functions. 

In another aspect of this embodiment the invention provides a method 
for identifying a receptor protein or other ligand-binding protein which binds 
specifically to a HLF polypeptide. For example, a cellular compartment, such
as a membrane or a preparation thereof, may be prepared from a cell that
expresses a molecule that binds HLF. The preparation is incubated with labeled
HLF. HLF and complexes of HLF bound to the receptor or other binding
protein are isolated and characterized according to routine methods known in the
art. Alternatively, the HLF polypeptide may be bound to a solid support so that
binding molecules solubilized from cells are bound to the column and then
eluted and characterized according to routine methods. 

In the assay of the invention for agonists or antagonists, a cellular 
compartment, such as a membrane or a preparation thereof, may be prepared 
from a cell that expresses a molecule that binds HLF, such as a molecule of a
signaling or regulatory pathway modulated by HLF. The preparation is
incubated with labeled HLF in the absence or the presence of a candidate
molecule which may be a HLF agonist or antagonist. The ability of the
candidate molecule to bind the binding molecule is reflected in decreased
binding of the labeled ligand. Molecules which bind gratuitously, i.e., without
inducing the effects of HLF on binding the HLF binding molecule, are most
likely to be good antagonists. Molecules that bind well and elicit effects that are 
the same as or closely related to HLF are agonists. 

HLF-like effects of potential agonists and antagonists may be measured,
for instance, by determining activity of a second messenger system following
interaction of the candidate molecule with a cell or appropriate cell preparation,
and comparing the effect with that of HLF or molecules that elicit the same
effects as HLF. Second messenger systems that may be useful in this regard
include but are not limited to AMP guanylate cyclase, ion channel or 
phosphoinositide hydrolysis second messenger systems. 

Another example of an assay for HLF antagonists is a competitive assay 
that combines HLF and a potential antagonist with membrane-bound HLF 
receptor molecules or recombinant HLF receptor molecules under appropriate 
conditions for a competitive inhibition assay. HLF can be labeled, such as by 
radioactivity, such that the number of HLF molecules bound to a receptor 
molecule can be determined accurately to assess the effectiveness of the potential 
antagonist. 
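
By way of illustration only, data from such a competitive assay might be reduced as in the following Python sketch, which converts bound radioactivity into percent of control binding and interpolates the antagonist concentration giving 50% inhibition. The function names, the nonspecific-binding correction and all numerical values are hypothetical assumptions and are not taken from the present disclosure.

```python
import math

def percent_bound(cpm, cpm_total, cpm_nonspecific):
    """Specific binding of labeled HLF as a percentage of the no-antagonist control."""
    return 100.0 * (cpm - cpm_nonspecific) / (cpm_total - cpm_nonspecific)

def ic50_log_interpolate(concs, percents):
    """Interpolate, on a log10 concentration axis, where binding crosses 50%.

    concs must be ascending and positive; returns None if 50% is never crossed.
    """
    points = list(zip(concs, percents))
    for (c_lo, p_lo), (c_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= 50.0 >= p_hi and p_lo != p_hi:
            frac = (p_lo - 50.0) / (p_lo - p_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical counts (cpm) at increasing antagonist concentrations (nM).
concs = [1.0, 10.0, 100.0, 1000.0]
cpm = [9200.0, 7400.0, 3800.0, 1500.0]
percents = [percent_bound(c, cpm_total=10000.0, cpm_nonspecific=1000.0) for c in cpm]
ic50 = ic50_log_interpolate(concs, percents)
```

A lower interpolated value would indicate a more effective potential antagonist under the stated assumptions; a full curve fit could be substituted for the simple interpolation.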

Potential antagonists include small organic molecules, peptides, 
polypeptides and antibodies that bind to a polypeptide of the invention and 
thereby inhibit or extinguish its activity. Potential antagonists also may be small 
organic molecules, a peptide, a polypeptide such as a closely related protein or 
antibody that binds the same sites on a binding molecule, such as a receptor 
molecule, without inducing HLF-induced activities, thereby preventing the 
action of HLF by excluding HLF from binding.

Other potential antagonists include antisense molecules. Antisense 
technology can be used to control gene expression through antisense DNA or
RNA or through triple-helix formation. Antisense techniques are discussed, for 
example, in Okano, J. Neurochem. 56: 560 (1991); "Oligodeoxynucleotides as 
Antisense Inhibitors of Gene Expression." CRC Press, Boca Raton, FL (1988). 
Triple helix formation is discussed in, for instance, Lee et al., Nucleic Acids
Research 6: 3073 (1979); Cooney et al., Science 241: 456 (1988); and Dervan
et al., Science 251: 1360 (1991). The methods are based on binding of a
polynucleotide to a complementary DNA or RNA. For example, the 5' coding 
portion of a polynucleotide that encodes the mature polypeptide of the present 
invention may be used to design an antisense RNA oligonucleotide of from 
about 10 to 40 base pairs in length. A DNA oligonucleotide is designed to be 
complementary to a region of the gene involved in transcription thereby 
preventing transcription and the production of HLF. The antisense RNA 
oligonucleotide hybridizes to the mRNA in vivo and blocks translation of the 
mRNA molecule into HLF polypeptide. The oligonucleotides described above 
can also be delivered to cells such that the antisense RNA or DNA may be 
expressed in vivo to inhibit production of HLF protein. 
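
The design of an antisense RNA oligonucleotide against a window of the 5' coding portion may be sketched, for instance, as follows. This is a minimal Python illustration only: the function name is hypothetical, the example sequence is an arbitrary placeholder and not the HLF sequence, and the length check simply mirrors the range of about 10 to 40 stated above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def antisense_rna(coding_dna, start=0, length=20):
    """Return an antisense RNA oligo (5'->3') against a window of the coding strand."""
    if not 10 <= length <= 40:
        raise ValueError("length outside the 10-40 nucleotide range given in the text")
    window = coding_dna.upper()[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the sequence")
    # Reverse complement of the sense strand, then DNA -> RNA alphabet.
    return window.translate(COMPLEMENT)[::-1].replace("T", "U")

# Hypothetical 5' coding fragment; NOT the HLF sequence.
example = "ATGGCCTTCAAGGACCTGTTGAAGCTG"
oligo = antisense_rna(example, start=0, length=12)
```

The resulting oligonucleotide is complementary to the chosen mRNA window and would therefore be expected to hybridize to it.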

The agonists and antagonists may be employed in a composition with a 
pharmaceutically acceptable carrier, e.g., as described above. The antagonists
may be employed for instance to inhibit the binding to, and activation of, cell
surface receptor molecules belonging to the erbB family, as well as other known 
or unknown cell surface receptor molecules. Consequently, inhibition of such 
receptor binding will result in the indirect inhibition of stimulation of the 
corresponding signal transduction pathways. Many of the corresponding signal 
transduction pathways are involved in the regulation of cell division and 
growth. The genesis or acceleration of many cancers resulting from other 
related or unrelated mechanisms is linked to abnormally increased levels of cell 
surface receptor molecule stimulation. The activity of an HLF antagonist will 
result in blocking an abnormally increased level of HLF activity, and, in turn, 
diminish an abnormally increased level of the stimulation of signal transduction 
pathways. This situation will ultimately result in a return to the normal 
regulation of cell division and growth and a corresponding diminution of the
corresponding oncogenic state. Thus, HLF antagonists of the present invention 
may be employed to treat cancers. Any of the above antagonists may be 
employed in a composition with a pharmaceutically acceptable carrier, e.g., as
hereinafter described. 

Gene Mapping 

The nucleic acid molecules of the present invention are also valuable for 
chromosome identification. The sequence is specifically targeted to and can 
hybridize with a particular location on an individual human chromosome. 
Moreover, there is a current need for identifying particular sites on the 
chromosome. Few chromosome marking reagents based on actual sequence 
data (repeat polymorphisms) are presently available for marking chromosomal 
location. The mapping of DNAs to chromosomes according to the present 
invention is an important first step in correlating those sequences with genes 
associated with disease. 

In certain preferred embodiments in this regard, the cDNA herein 
disclosed is used to clone genomic DNA of a HLF protein gene. This can be 
accomplished using a variety of well known techniques and libraries, which 
generally are available commercially. The genomic DNA then is used for in situ 
chromosome mapping using well known techniques for this purpose. 

In addition, in some cases, sequences can be mapped to chromosomes 
by preparing PCR primers (preferably 15-25 bp) from the cDNA. Computer
analysis of the 3' untranslated region of the gene is used to rapidly select
primers that do not span more than one exon in the genomic DNA, since primers
spanning an exon boundary would complicate the amplification process. These
primers are then used for PCR
screening of somatic cell hybrids containing individual human chromosomes. 
Fluorescence in situ hybridization ("FISH") of a cDNA clone to a metaphase 
chromosomal spread can be used to provide a precise chromosomal location in 
one step. This technique can be used with probes from the cDNA as short as 50
or 60 bp. For a review of this technique, see Verma et al., Human
Chromosomes: A Manual Of Basic Techniques, Pergamon Press, New York 
(1988). 

Once a sequence has been mapped to a precise chromosomal location, 
the physical position of the sequence on the chromosome can be correlated with
genetic map data. Such data are found, for example, in V. McKusick,
Mendelian Inheritance In Man, available on-line through Johns Hopkins
University, Welch Medical Library. The relationships between genes and
diseases that have been mapped to the same chromosomal region are then
identified through linkage analysis (coinheritance of physically adjacent genes).

Next, it is necessary to determine the differences in the cDNA or 
genomic sequence between affected and unaffected individuals. If a mutation is 
observed in some or all of the affected individuals but not in any normal 
individuals, then the mutation is likely to be the causative agent of the disease. 
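
The comparison between affected and unaffected individuals may be illustrated by the following minimal Python sketch. It assumes, purely for illustration, that all sequences have already been aligned to a common reference of equal length; the function names and the short sequences are hypothetical and do not represent HLF data.

```python
def variant_positions(reference, sample):
    """Positions (0-based) where an aligned, equal-length sample differs from reference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return {i for i, (r, s) in enumerate(zip(reference, sample)) if r != s}

def candidate_mutations(reference, affected, unaffected):
    """Variants seen in at least one affected individual and in no unaffected one."""
    in_affected = set().union(*(variant_positions(reference, s) for s in affected))
    in_unaffected = set().union(*(variant_positions(reference, s) for s in unaffected))
    return sorted(in_affected - in_unaffected)

# Hypothetical aligned sequences.
reference = "ACGTACGTAC"
affected = ["ACGTTCGTAC", "ACGTTCGTAC"]
unaffected = ["ACGTACGTAC", "ACGTACGTAT"]
hits = candidate_mutations(reference, affected, unaffected)
```

Positions returned by such a comparison are only candidates; as stated above, a mutation is likely causative only when it is absent from all normal individuals.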

Having generally described the invention, the same will be more readily
understood by reference to the following examples, which are provided by way
of illustration and are not intended as limiting. 

Examples 

Example 1(a): Expression and Purification of "GST-tagged"
EGF-like Domain of HLF in E. coli

The bacterial expression vector pGEX-3X was used for bacterial 
expression in this example (Pharmacia, Inc., Uppsala, Sweden). pGEX-3X 
encodes ampicillin antibiotic resistance ("Ampr") and contains a bacterial origin 
of replication ("ori"), an IPTG inducible promoter, and a sequence that encodes
an N-terminal, in frame, glutathione S-transferase (GST) tag that allows affinity
purification using one of the GST Purification Modules, and several suitable
single restriction enzyme cleavage sites. These elements are arranged such that
an inserted DNA fragment encoding a polypeptide expresses that polypeptide
with an N-terminal GST-fusion protein. 

The DNA sequence encoding the desired portion of the HLF protein
comprising the EGF-like domain of the HLF amino acid sequence was
amplified from the deposited cDNA clone using PCR oligonucleotide primers
which annealed to the amino and carboxy terminal sequences of the desired 
portion of the HLF protein. Additional nucleotides containing restriction sites to 
facilitate cloning in the pGEX-3X vector were added to the 5' and 3' primer 
sequences, respectively. For cloning the EGF-like domain of the HLF protein, 
the 5' primer had the sequence 5' GGCGGATCCCTCTTCTTCCTCCTCC 3'
(SEQ ID NO:5) containing the underlined Bam HI restriction site followed by
16 nucleotides of the amino terminal coding sequence of the EGF-like domain
of the HLF sequence in SEQ ID NO:2. The 3' primer had the sequence
5' GGCGAATTCTAAACTTCTTCACTCTCCATGAATTCAATCCCC 3' (SEQ ID NO:6)
containing the underlined Eco RI restriction site followed by 33 nucleotides
complementary to the 3' end of the EGF-like domain of the HLF DNA sequence
in Figure 1A.
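
The layout of the two primers (a short 5' clamp, the restriction site, and the annealing region) can be checked with a short Python sketch such as the following. The helper name is hypothetical, and the primer strings are the sequences of SEQ ID NO:5 and SEQ ID NO:6 as read from the text with whitespace removed and the Bam HI site taken as GGATCC; they should be verified against the sequence listing before any use.

```python
def split_primer(primer, site):
    """Split a primer into (5' clamp, restriction site, annealing region).

    Uses the first (5'-most) occurrence of the recognition sequence.
    """
    seq = primer.replace(" ", "").upper()
    idx = seq.find(site)
    if idx < 0:
        raise ValueError(f"site {site} not found in primer")
    return seq[:idx], site, seq[idx + len(site):]

BAMHI = "GGATCC"  # Bam HI recognition sequence
ECORI = "GAATTC"  # Eco RI recognition sequence

# Primer sequences as read from the text (assumed transcription).
fwd = "GGCGGATCCCTCTTCTTCCTCCTCC"
rev = "GGCGAATTCTAAACTTCTTCACTCTCCATGAATTCAATCCCC"

fwd_parts = split_primer(fwd, BAMHI)
rev_parts = split_primer(rev, ECORI)
fwd_anneal_len = len(fwd_parts[2])
rev_anneal_len = len(rev_parts[2])
```

Under these assumptions the annealing regions are 16 and 33 nucleotides long, in agreement with the lengths stated above.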

The amplified HLF DNA fragment and the vector pGEX-3X were 
digested with Bam HI and Eco RI and the digested DNAs were then ligated
together. Insertion of the HLF DNA into the restricted pGEX-3X vector placed 
the HLF protein coding region downstream from the IPTG-inducible promoter 
and in frame with an initiating AUG and the N-terminal GST fusion tag. 

The ligation mixture was transformed into competent E. coli cells using 
standard procedures such as those described in Sambrook et al., Molecular 
Cloning: a Laboratory Manual, 2nd Ed.: Cold Spring Harbor Laboratory Press, 
Cold Spring Harbor, NY (1989). Plasmid DNA was isolated from resistant 
colonies and the identity of the cloned DNA confirmed by restriction analysis, 
PCR and DNA sequencing. 

Clones containing the desired constructs were grown overnight ("O/N") 
in liquid culture in LB media supplemented with ampicillin (100 µg/ml). The
O/N culture was used to inoculate a large culture, at a dilution of approximately
1:25 to 1:250. The cells were grown to an optical density at 600 nm ("OD600")
of approximately 0.4. Isopropyl-β-D-thiogalactopyranoside ("IPTG") was then
added to a final concentration of 0.1 mM to induce transcription from the lac
repressor sensitive promoter, by inactivating the lacI repressor. Cells
subsequently were incubated further for 3 to 4 hours. Cells then were harvested
by centrifugation, resuspended in 1X PBS, and lysed by sonication.
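
The dilution and induction volumes implied by the above procedure follow from simple proportionality, as in the Python sketch below. The culture volume of 500 ml and the 100 mM IPTG stock concentration are hypothetical values chosen only for illustration; they are not specified in this example.

```python
def inoculum_volume_ml(final_volume_ml, dilution_factor):
    """Volume of overnight culture needed for a 1:dilution_factor dilution."""
    return final_volume_ml / dilution_factor

def stock_volume_ml(final_conc_mm, final_volume_ml, stock_conc_mm):
    """C1*V1 = C2*V2, ignoring the small volume contributed by the stock itself."""
    return final_conc_mm * final_volume_ml / stock_conc_mm

culture_ml = 500.0                                # hypothetical culture size
inoc_low = inoculum_volume_ml(culture_ml, 250)    # 1:250 dilution
inoc_high = inoculum_volume_ml(culture_ml, 25)    # 1:25 dilution
# Hypothetical 100 mM IPTG stock, diluted to the 0.1 mM final concentration.
iptg_ml = stock_volume_ml(0.1, culture_ml, stock_conc_mm=100.0)
```

With these assumed values, between 2 ml and 20 ml of overnight culture would be used, and 0.5 ml of IPTG stock would be added at induction.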

The expressed GST-HLF(EGF domain) fusion protein was purified 
using glutathione sepharose 4B essentially as described by the manufacturer 
(Pharmacia, Uppsala, Sweden). Briefly, cell lysates were combined with the 
glutathione sepharose 4B. The mixture was pelleted by centrifugation and 
washed. The GST fusion portion of the polypeptide was cleaved by the 
addition of thrombin site-specific protease for 18 hours. Following cleavage, 
thrombin was bound to p-aminobenzamidine agarose beads. The
thrombin-p-aminobenzamidine agarose bead complexes and the GST-glutathione
sepharose complexes were pelleted by centrifugation. The supernatant then 
contained the purified EGF domain of the HLF protein. Purity of the protein 
preparation was analyzed by SDS-PAGE. The purified protein was then stored 
frozen at -20° C. 

Example 2: Cloning and Expression of HLF protein in a 
Baculovirus Expression System 

In this illustrative example, the plasmid shuttle vector pA2GP is used to 
insert the cloned DNA encoding the mature protein, lacking its naturally 
associated secretory signal (leader) sequence, into a baculovirus to express the 
mature HLF protein, using a baculovirus leader and standard methods as 
described in Summers et al., A Manual of Methods for Baculovirus Vectors and 
Insect Cell Culture Procedures, Texas Agricultural Experimental Station Bulletin 
No. 1555 (1987). This expression vector contains the strong polyhedrin
promoter of the Autographa californica nuclear polyhedrosis virus (AcMNPV)
followed by the secretory signal peptide (leader) of the baculovirus gp67 protein 
and convenient restriction sites such as Bam HI, Xba I and Asp 718. The 
polyadenylation site of the simian virus 40 (SV40) is used for efficient 
polyadenylation. For easy selection of recombinant virus, the plasmid contains
the beta-galactosidase gene from E. coli under control of a weak Drosophila
promoter in the same orientation, followed by the polyadenylation signal of the
polyhedrin gene. The inserted genes are flanked on both sides by viral
sequences for cell-mediated homologous recombination with wild-type viral
DNA to generate viable virus that expresses the cloned polynucleotide.

Many other baculovirus vectors could be used in place of the vector
above, such as pAc373, pVL941 and pAcIM1, as one skilled in the art would
readily appreciate, as long as the construct provides appropriately located 
signals for transcription, translation, secretion and the like, including a signal 
peptide and an in-frame AUG as required. Such vectors are described, for 
instance, in Luckow et al., Virology 170:31-39 (1989).

The cDNA sequence encoding the mature HLF protein in the deposited 
clone, lacking the AUG initiation codon and the naturally associated leader 
sequence shown in SEQ ID NO:2, is amplified using PCR oligonucleotide 
primers corresponding to the 5' and 3' sequences of the gene. The 5' primer 
has the sequence 5' GGCGGATCCCCTCTTCTTCCTCCTCC 3' (SEQ ID
NO:7) containing the underlined Bam HI restriction enzyme site followed by 16
nucleotides of the sequence of the mature HLF protein shown in SEQ ID NO:2,
beginning with the indicated N-terminus of the extracellular domain of the HLF
protein. The 3' primer has the sequence
5' GGCGGTACCTAAACTTCTTCACTCTCCATGAATTCAATCCCC 3' (SEQ ID NO:8)
containing the underlined Asp 718 restriction site followed by 33 nucleotides
complementary to the 3' coding sequence in Figure 1A.

The amplified fragment is isolated from a 1% agarose gel using a 
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The 
fragment then is digested with Bam HI and Asp 718 and again is purified on a 
1% agarose gel. This fragment is designated herein "F1".

The plasmid is digested with the restriction enzymes Bam HI and Asp 
718 and optionally, can be dephosphorylated using calf intestinal phosphatase, 
using routine procedures known in the art. The DNA is then isolated from a 1%
agarose gel using a commercially available kit ("Geneclean" BIO 101 Inc., La
Jolla, Ca.). This vector DNA is designated herein "V1".

Fragment F1 and the dephosphorylated plasmid V1 are ligated together
with T4 DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1
Blue (Stratagene Cloning Systems, La Jolla, CA) cells are transformed with
the ligation mixture and spread on culture plates. Bacteria are identified that 
contain the plasmid with the human HLF gene by digesting DNA from 
individual colonies using Bam HI and Asp 718 and then analyzing the digestion 
product by gel electrophoresis. The sequence of the cloned fragment is
confirmed by DNA sequencing. This plasmid is designated herein
pA2GPHLF. 

Five µg of the plasmid pA2GPHLF is co-transfected with 1.0 µg of a
commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus
DNA", Pharmingen, San Diego, CA), using the lipofection method described
by Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg
of BaculoGold™ virus DNA and 5 µg of the plasmid pA2GPHLF are mixed in
a sterile well of a microtiter plate containing 50 µl of serum-free Grace's
medium (Life Technologies Inc., Gaithersburg, MD). Afterwards, 10 µl
Lipofectin plus 90 µl Grace's medium are added, mixed and incubated for 15
minutes at room temperature. Then the transfection mixture is added drop-wise 
to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate 
with 1 ml Grace's medium without serum. The plate is then incubated for 5 
hours at 27° C. The transfection solution is then removed from the plate and 1 
ml of Grace's insect medium supplemented with 10% fetal calf serum is added. 
Cultivation is then continued at 27° C for four days. 

After four days the supernatant is collected and a plaque assay is 
performed, as described by Summers and Smith, supra. An agarose gel with 
"Blue Gal" (Life Technologies Inc., Gaithersburg) is used to allow easy 
identification and isolation of gal-expressing clones, which produce blue-stained 
plaques. (A detailed description of a "plaque assay" of this type can also be 
found in the user's guide for insect cell culture and baculovirology distributed 
by Life Technologies Inc., Gaithersburg, page 9-10). After appropriate 
incubation, blue stained plaques are picked with the tip of a micropipettor (e.g., 
Eppendorf). The agar containing the recombinant viruses is then resuspended 
in a microcentrifuge tube containing 200 µl of Grace's medium and the
suspension containing the recombinant baculovirus is used to infect Sf9 cells 
seeded in 35 mm dishes. Four days later the supernatants of these culture 
dishes are harvested and then they are stored at 4° C. The recombinant virus is 
called V-HLF. 

To verify the expression of the HLF gene Sf9 cells are grown in Grace's 
medium supplemented with 10% heat-inactivated FBS. The cells are infected 
with the recombinant baculovirus V-HLF at a multiplicity of infection ("MOI") 
of about 2. If radiolabeled proteins are desired, 6 hours later the medium is 
removed and is replaced with SF900 II medium minus methionine and cysteine
(available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 µCi
of 35S-methionine and 5 µCi 35S-cysteine (available from Amersham) are added.
The cells are further incubated for 16 hours and then are harvested by
centrifugation. The proteins in the supernatant as well as the intracellular 
proteins are analyzed by SDS-PAGE followed by autoradiography (if 
radiolabeled). 
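
The volume of viral stock required to reach a given multiplicity of infection follows from the cell number and the titer of the stock, as in the Python sketch below. The cell count and titer shown are hypothetical placeholders; in practice the titer of V-HLF would be determined by plaque assay.

```python
def virus_volume_ml(cell_count, moi, titer_pfu_per_ml):
    """Volume of virus stock giving `moi` plaque-forming units per cell."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return cell_count * moi / titer_pfu_per_ml

# Hypothetical values: 1e6 Sf9 cells, stock titer of 1e8 pfu/ml, MOI of about 2.
volume_ml = virus_volume_ml(cell_count=1.0e6, moi=2, titer_pfu_per_ml=1.0e8)
```

Under these assumed values, 0.02 ml (20 µl) of stock would be added to the culture.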

Microsequencing of the amino acid sequence of the amino terminus of 
purified protein may be used to determine the amino terminal sequence of the 
extracellular domain of the HLF protein. 

Example 3: Cloning and Expression of HLF in Mammalian Cells

A typical mammalian expression vector contains the promoter element,
which mediates the initiation of transcription of mRNA, the protein coding 
sequence, and signals required for the termination of transcription and 
polyadenylation of the transcript. Additional elements include enhancers, 
Kozak sequences and intervening sequences flanked by donor and acceptor sites 
for RNA splicing. Highly efficient transcription can be achieved with the early 
and late promoters from SV40, the long terminal repeats (LTRs) from 
Retroviruses, e.g., RSV, HTLVI, HIVI and the early promoter of the 
cytomegalovirus (CMV). However, cellular elements can also be used (e.g., 
the human actin promoter). Suitable expression vectors for use in practicing the 
present invention include, for example, vectors such as pSVL and pMSG 
(Pharmacia, Uppsala, Sweden), pRSVcat (ATCC 37152), pSV2dhfr (ATCC 
37146) and pBC12MI (ATCC 67109). Mammalian host cells that could be 
used include human HeLa, 293, H9 and Jurkat cells, mouse NIH3T3 and
C127 cells, Cos 1, Cos 7 and CV1, quail QC1-3 cells, mouse L cells and
Chinese hamster ovary (CHO) cells. 

Alternatively, the gene can be expressed in stable cell lines that contain 
the gene integrated into a chromosome. The co-transfection with a selectable 
marker such as dhfr, gpt, neomycin, hygromycin allows the identification and 
isolation of the transfected cells. 

The transfected gene can also be amplified to express large amounts of 
the encoded protein. The DHFR (dihydrofolate reductase) marker is useful to 
develop cell lines that carry several hundred or even several thousand copies of 
the gene of interest. Another useful selection marker is the enzyme glutamine
synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991); Bebbington et
al., Bio/Technology 10:169-175 (1992)). Using these markers, the mammalian
cells are grown in selective medium and the cells with the highest resistance are 
selected. These cell lines contain the amplified gene(s) integrated into a 
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for 
the production of proteins. 

The expression vectors pC1 and pC4 contain the strong promoter (LTR)
of the Rous Sarcoma Virus (Cullen et al., Molecular and Cellular Biology,
438-447 (March, 1985)) plus a fragment of the CMV-enhancer (Boshart et al.,
Cell 41:521-530 (1985)). Multiple cloning sites, e.g., with the restriction
enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the cloning of the
gene of interest. The vectors contain in addition the 3' intron, the 
polyadenylation and termination signal of the rat preproinsulin gene. 

Example 3(a): Cloning and Expression in COS Cells 

The expression plasmid, pHLFHA, is made by cloning a portion of the 
cDNA encoding the extracellular domain of the HLF protein into the expression
vector pcDNAI/Amp or pcDNAIII (which can be obtained from Invitrogen, 
Inc.). To produce a soluble, secreted form of the polypeptide, the extracellular 
domain is fused to the secretory leader sequence of the human IL-6 gene.

The expression vector pcDNAI/amp contains: (1) an E. coli origin of
replication effective for propagation in E. coli and other prokaryotic cells; (2) an 
ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; 
(3) an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV 
promoter, a polylinker, an SV40 intron; (5) several codons encoding a 
hemagglutinin fragment (i.e., an "HA" tag to facilitate purification) followed by 
a termination codon and polyadenylation signal arranged so that a cDNA can be 
conveniently placed under expression control of the CMV promoter and 
operably linked to the SV40 intron and the polyadenylation signal by means of 
restriction sites in the polylinker. The HA tag corresponds to an epitope derived 
from the influenza hemagglutinin protein described by Wilson et al., Cell 
37:767 (1984). The fusion of the HA tag to the target protein allows easy 
detection and recovery of the recombinant protein with an antibody that 
recognizes the HA epitope. pcDNAIII contains, in addition, the selectable 
neomycin marker.

A DNA fragment encoding the extracellular domain of the HLF
polypeptide is cloned into the polylinker region of the vector so that recombinant
protein expression is directed by the CMV promoter. The plasmid construction
strategy is as follows. The HLF cDNA of the deposited clone is amplified
using primers that contain convenient restriction sites, much as described above
for construction of vectors for expression of HLF in E. coli. Suitable primers
include the following, which are used in this example. The 5' primer,
containing the underlined Bam HI site, a Kozak sequence, an AUG start codon,
a sequence encoding the secretory leader peptide from the human IL-6 gene, and
16 nucleotides of the 5' coding region of the extracellular domain of the HLF
polypeptide, has the following sequence:
5' GCCGGATCCGCCACCATGAAC
TCCTTCTCCACAAGCGCCTTCGGTCCAGTTGCCTTCTCCCTGGGGCT
GCTCCTGGTGTTGCCTGCTGCCTTCCCTGCCCCAGTCTCTTCTTCCTC
CTCC 3' (SEQ ID NO:9). The 3' primer, containing the underlined Xba I site and
33 nucleotides complementary to the 3' coding sequence immediately before
the stop codon, has the following sequence:
5' GGCTCTAGATAAACTTCTTCACTCTCCATGAATTCAATCCCC 3' (SEQ ID NO:10).

The PCR amplified DNA fragment and the vector, pcDNAI/Amp, are
digested with Bam HI and Xba I and then ligated. The ligation mixture is
transformed into E. coli strain SURE (available from Stratagene Cloning
Systems, 11099 North Torrey Pines Road, La Jolla, CA 92037), and the
transformed culture is plated on ampicillin media plates which then are incubated
to allow growth of ampicillin resistant colonies. Plasmid DNA is isolated from
resistant colonies and examined by restriction analysis or other means for the
presence of the fragment encoding the extracellular domain of the HLF
polypeptide.

For expression of recombinant HLF, COS cells are transfected with an 
expression vector, as described above, using DEAE-DEXTRAN, as described,
for instance, in Sambrook et al., Molecular Cloning: a Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989). Cells are
incubated under conditions for expression of HLF by the vector. 

Expression of the HLF-HA fusion protein is detected by radiolabeling 
and immunoprecipitation, using methods described in, for example, Harlow et
al., Antibodies: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, New York (1988). To this end, two days after
transfection, the cells are labeled by incubation in media containing 35S-cysteine
for 8 hours. The cells and the media are collected, and the cells are washed and
then lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% NP-40,
0.1% SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by
Wilson et al. cited above. Proteins are precipitated from the cell lysate and from 
the culture media using an HA-specific monoclonal antibody. The precipitated 
proteins then are analyzed by SDS-PAGE and autoradiography. An expression 
product of the expected size is seen in the cell lysate, which is not seen in 
negative controls. 

Example 3(b): Cloning and Expression in CHO Cells 

The vector pC4 is used for the expression of HLF polypeptide. Plasmid 
pC4 is a derivative of the plasmid pSV2-dhfr (ATCC Accession No. 37146). 
To produce a soluble, secreted form of the polypeptide, the extracellular domain 
is fused to the secretory leader sequence of the human IL-6 gene. The plasmid 
contains the mouse DHFR gene under control of the SV40 early promoter.
Chinese hamster ovary or other cells lacking dihydrofolate reductase activity that are
transfected with these plasmids can be selected by growing the cells in a 
selective medium (alpha minus MEM, Life Technologies) supplemented with 
the chemotherapeutic agent methotrexate. The amplification of the DHFR genes 
in cells resistant to methotrexate (MTX) has been well documented (see, e.g.,
Alt, F. W., Kellems, R. M., Bertino, J. R., and Schimke, R. T., 1978, J.
Biol. Chem. 253:1357-1370, Hamlin, J. L. and Ma, C. 1990, Biochem. et
Biophys. Acta 1097:107-143, Page, M. J. and Sydenham, M. A. 1991,
Biotechnology 9:64-68). Cells grown in increasing concentrations of MTX 
develop resistance to the drug by overproducing the target enzyme, DHFR, as a 
result of amplification of the DHFR gene. If a second gene is linked to the 
DHFR gene, it is usually co-amplified and over-expressed. It is known in the 
art that this approach may be used to develop cell lines carrying more than 1,000
copies of the amplified gene(s). Subsequently, when the methotrexate is 
withdrawn, cell lines are obtained which contain the amplified gene integrated 
into one or more chromosome(s) of the host cell.

Plasmid pC4 contains, for expressing the gene of interest, the strong
promoter of the long terminal repeat (LTR) of the Rous Sarcoma Virus
(Cullen, et al., Molecular and Cellular Biology, March 1985:438-447) plus a
fragment isolated from the enhancer of the immediate early gene of human
cytomegalovirus (CMV) (Boshart et al., Cell 41:521-530 (1985)). Downstream
of the promoter are the following single restriction enzyme cleavage sites that 
allow the integration of the genes: BamHI, Xba I, and Asp718. Behind these 
cloning sites the plasmid contains the 3' intron and polyadenylation site of the
rat preproinsulin gene. Other high efficiency promoters can also be used for the
expression, e.g., the human β-actin promoter, the SV40 early or late promoters
or the long terminal repeats from other retroviruses, e.g., HIV and HTLVI. 
Clontech's Tet-Off and Tet-On gene expression systems and similar systems 
can be used to express the HLF polypeptide in a regulated way in mammalian 
cells (Gossen, M., & Bujard, H. 1992, Proc. Natl. Acad. Sci. USA
89:5547-5551). For the polyadenylation of the mRNA other signals, e.g., from the
human growth hormone or globin genes can be used as well. Stable cell lines 
carrying a gene of interest integrated into the chromosomes can also be selected 
upon co-transfection with a selectable marker such as gpt, G418 or 
hygromycin. It is advantageous to use more than one selectable marker in the 
beginning, e.g., G418 plus methotrexate. 

The plasmid pC4 is digested with the restriction enzymes Bam HI and 
Asp 718 and then dephosphorylated using calf intestinal phosphatase by
procedures known in the art. The vector is then isolated from a 1% agarose gel. 

The DNA sequence encoding the extracellular domain of the HLF 
polypeptide is amplified using PCR oligonucleotide primers corresponding to
the 5' and 3' sequences of the desired portion of the gene. The 5' primer
containing the underlined Bam HI site, a Kozak sequence, an AUG start codon,
a sequence encoding the secretory leader peptide from the human IL-6 gene,
and 16 nucleotides of the 5' coding region of the extracellular domain of the
HLF polypeptide, has the following sequence (where Kozak is in italics):
5' GCCGGATCCGCCACCATGAACTCCTTCTCCACAAGCGCCTTCGGT
CCAGTTGCCTTCTCCCTGGGGCTGCTCCTGGTGTTGCCTGCTGCCTT
CCCTGCCCCAGTCTCTTCTTCCTCCTCC 3' (SEQ ID NO:9). The 3'
primer, containing the underlined Asp 718 restriction site and 33 nucleotides
complementary to the 3' coding sequence immediately before the stop codon as
shown in Figure 1A (SEQ ID NO:1), has the following sequence:
5' GGCGGTACCTAAACTTCTTCACTCTCCATGAATTCAATCCCC 3'
(SEQ ID NO:8).

The amplified fragment is digested with the endonucleases Bam HI and 
Asp 718 and then purified again on a 1% agarose gel. The isolated fragment
and the dephosphorylated vector are then ligated with T4 DNA ligase. E. coli
HB101 or XL-1 Blue cells are then transformed and bacteria are identified that 
contain the fragment inserted into plasmid pC4 using, for instance, restriction 
enzyme analysis. 

Chinese hamster ovary cells lacking an active DHFR gene are used for
transfection. Five µg of the expression plasmid pC4 is cotransfected with 0.5
µg of the plasmid pSV2-neo using lipofectin (Felgner et al., supra). The plasmid
pSV2-neo contains a dominant selectable marker, the neo gene from Tn5
encoding an enzyme that confers resistance to a group of antibiotics including
G418. The cells are seeded in alpha minus MEM supplemented with 1 mg/ml
G418. After 2 days, the cells are trypsinized and seeded in hybridoma cloning
plates (Greiner, Germany) in alpha minus MEM supplemented with 10, 25, or
50 ng/ml of methotrexate plus 1 mg/ml G418. After about 10-14 days single
clones are trypsinized and then seeded in 6-well petri dishes or 10 ml flasks
using different concentrations of methotrexate (50 nM, 100 nM, 200 nM, 400
nM, 800 nM). Clones growing at the highest concentrations of methotrexate are
then transferred to new 6-well plates containing even higher concentrations of
methotrexate (1 µM, 2 µM, 5 µM, 10 µM, 20 µM). The same procedure is
repeated until clones are obtained which grow at a concentration of 100 - 200
µM. Expression of the desired gene product is analyzed, for instance, by SDS-
PAGE and Western blot or by reversed phase HPLC analysis.

Example 4: Tissue distribution of HLF mRNA expression 

Northern blot analysis is carried out to examine HLF gene expression in
human tissues, using methods described by, among others, Sambrook et al.,
cited above. A cDNA probe containing the entire nucleotide sequence of the
HLF protein (SEQ ID NO:1) is labeled with ³²P using the rediprime™ DNA
labeling system (Amersham Life Science), according to manufacturer's
instructions. After labeling, the probe is purified using a CHROMA
SPIN-100™ column (Clontech Laboratories, Inc.), according to manufacturer's
protocol number PT1200-1. The purified labeled probe is then used to examine
various human tissues for HLF mRNA.

Multiple Tissue Northern (MTN) blots containing various human tissues 
(H) or human immune system tissues (IM) are obtained from Clontech and are 
examined with the labeled probe using ExpressHyb™ hybridization solution 
(Clontech) according to manufacturer's protocol number PT1190-1. Following
hybridization and washing, the blots are mounted and exposed to film at -70°C
overnight, and films developed according to standard procedures. 

Example 5: Analysis of erbB Receptor Family Activation: 

To test for the ability of the recombinant EGF domain of the HLF protein
(as produced in Example 1) to activate erbB family members, a tyrosine kinase
activation assay was used as follows. In this analysis, a human breast cancer
cell line (MCF-7) was allowed to become quiescent by extended culture in low
serum medium. Exogenous recombinant EGF domain of the HLF protein (10
µg/mL) or recombinant heregulin (0.1 µg/mL) was added to the growth
medium, and cell culture was continued in the presence or absence of
exogenous protein for 30 minutes. Cells were harvested and lysed by the
addition of SDS-containing sample buffer (1% SDS, 0.15 M Tris, pH 8.6, 5%
BME, and 1 mM sodium orthovanadate).

Cell lysates were then subjected to SDS-PAGE on 16-20% Tris-glycine
gradient gels (Novex). Subsequently, electrophoretically separated proteins 
were transferred to a Hybond ECL nitrocellulose membrane (Amersham). 
Tyrosine phosphate containing proteins were identified by immunoblotting 
using anti-phosphotyrosine antibodies. 

As shown in Figure 4, there is a clear increase in the tyrosine 
phosphorylation of proteins in the size range of approximately 185 kDa in 
samples prepared from cultures which were grown in medium which contained 
the recombinant EGF domain of the HLF protein. The erbB family of cell 
surface receptor molecules consists of at least four members, all of which are 
roughly the molecular mass of the proteins observed to increase in tyrosine 
phosphorylation in this analysis. Furthermore, treatment of MCF-7 cells with 
recombinant heregulin in this analysis produced a similar result with regard to a 
change in the tyrosine phosphorylation state of cellular proteins. 

These results strongly suggest that the recombinant EGF domain of the
HLF protein was able to activate phosphorylation of at least one of the members
of the erbB family of cell surface receptors expressed in these cells.

Example 6: HLF in Breast Cancer Cells, Activation of Multiple erbB Proteins:

INTRODUCTION 

Increased activity of members of the erbB family has been implicated in
the development of cancer. Different molecular mechanisms of activation have
been identified. The ligands of the EGF/Heregulin family are inappropriately
expressed in breast cancers. EGF, α-TGF, amphiregulin and heregulin are
expressed in breast cancers containing appropriate receptors, thus leading to
autocrine growth stimulation. Autocrine growth stimulation
is required for the transformation of NIH/3T3 cells with high levels of EGF
receptor, since full morphological transformation requires the co-expression of
α-TGF. The causative role of autocrine growth stimulation by α-TGF in breast
cancer is demonstrated in experiments using transgenic animals where the
expression of α-TGF acts synergistically to produce frequent breast cancers.
These findings indicate that the co-incident expression of erbB receptor proteins
with their ligands can result in aberrant cell growth. Moreover, in 20% of
breast cancers amplification of the erbB2 gene results in overexpression of
p185erbB2. In these cancers activation of signalling has been thought to be
independent of ligand activation. Overexpression of p185erbB2 is an oncogenic
event in experimental systems. To date no ligand has been isolated that binds
only to erbB2. However, recent studies show that erbB2 can form part of a
receptor for heregulin. Overexpression of the erbB1 (EGFR) protein is
common, although gene amplification is infrequent in breast cancer.

The erbB receptors bind their ligands as dimers - formed from two
identical erbB proteins (homodimers) or from two different proteins
(heterodimers). EGF can bind homodimers of erbB1 (EGFR) or heterodimers
of erbB1 and erbB2. Similarly, heregulin-1β can bind homodimers of erbB4 or
heterodimers of erbB2 and erbB3. Other ligands of the EGF/Heregulin family
have receptors formed by homo- and heterodimers of erbB proteins. Ligand
binding and dimer formation lead to increased autophosphorylation of the
receptor proteins and substrates, activating intracellular signalling pathways.

The cellular consequences of receptor stimulation by members of the
EGF/Heregulin family vary with the ligand and cellular context. EGF and
α-TGF can stimulate the growth of many cells in culture, but in cases of breast
cancers that overexpress the EGF Receptor, EGF can be growth inhibitory at
concentrations above approximately 10 nM. In a similar way, heregulin can
both stimulate growth of some human cancer cells and inhibit those that
overexpress erbB2. The display of erbB proteins on breast cancer cells is not
uniform. Many cells lack one or more of the family and others greatly
overexpress erbB1 or erbB2. Heregulin clearly has effects on cell morphology
as evidenced by changes in the actin cytoskeleton. Heregulin seems to play a
number of specialized roles in the appropriate regulation of the neuromuscular
junction, between neuronal and glial cells and in Schwann cell development. In
prenatal development heregulin, erbB2 and erbB4 control morphogenesis
of brain and heart. These findings indicate that members of the EGF/Heregulin
family can have differing effects on cell phenotype.

In this study we characterize a new ligand of the EGF/Heregulin family
of growth factors. Our results indicate that HLF binds and activates multiple
members of the erbB family of receptors. We demonstrate that HLF is 
expressed in a human breast cancer cell line and can alter the growth of human 
breast cancer cell lines. These results indicate that HLF may have in vivo 
effects on the growth of the normal and malignant breast epithelial cells. 

RESULTS 

HLF contains an EGF-like domain.

The ligands of the EGF/Heregulin family have a well-defined sequence 
similarity which we used to identify HLF. Figure 5 shows a compilation of 
known ligands of the EGF/Heregulin family; all contain 6 cysteines. Between 
the fourth and sixth cysteine is the common EGF-like folding motif containing a 
conserved hydrophobic amino acid (Y37 (Tyr-68 of SEQ ID NO:2)) and a 
conserved glycine (G39 (Gly-70 of SEQ ID NO:2)). This region apparently 
forms a very stable core structure that is used in many extracellular proteins. 
Sequence similarity among the ligands is not limited to this folding motif. There 
is an exactly conserved arginine (R41 (Arg-72 of SEQ ID NO:2)) and 
hydrophobic amino acids at positions 14 and 16 (Asn-45 and Asp-46 of SEQ ID 
NO:2). A hydrophobic amino acid that is required for binding activity is found 
at position 46 or 47 (Leu-77 or Pro-78 of SEQ ID NO:2). The number of
amino acids between cysteines is similar among the ligands with the notable
exception of loop B and loop C. Heregulins have a loop C which is three amino
acids longer than the EGF-like ligands. The overall sequence similarity among
the EGF/Heregulin family members is 19-42% (except between Hrg1α and
Hrg1β, which are derived from the same gene). Recently, NRG-2 has been
identified in rat brain. NRG-2 is most closely related to Hrg1β, with sequence
identity of 42% in the EGF-like domain.
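
The residue numbering used in this paragraph can be checked against SEQ ID NO:2. The following Python sketch is illustrative only: the 80-residue string is a one-letter transcription of the first 80 residues of SEQ ID NO:2 as printed in the Sequence Listing (the transcription is an assumption), and the names `HLF_N80`, `cysteine_positions`, `loop_lengths` and `residue` are hypothetical helpers, not part of any method described herein.

```python
# First 80 residues of SEQ ID NO:2 in one-letter code, 16 residues per row
# (transcribed from the Sequence Listing; treat the transcription as an assumption).
HLF_N80 = (
    "SSSSSATTTTPETSTS"
    "PKFHTTTYSTERSEHF"
    "KPCRDKDLAYCLNDGE"
    "CFVIETLTGSHKHCRC"
    "KEGYQGVRCDQFLPKT"
)


def cysteine_positions(seq):
    """Return the 1-based positions of every cysteine in seq."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "C"]


def loop_lengths(positions):
    """Return the number of residues strictly between consecutive cysteines."""
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def residue(seq, pos):
    """Return the residue at a 1-based position."""
    return seq[pos - 1]


cys = cysteine_positions(HLF_N80)
print("Cys positions:", cys)
print("Residues between consecutive cysteines:", loop_lengths(cys))
for name, pos in [("Tyr", 68), ("Gly", 70), ("Arg", 72), ("Leu", 77), ("Pro", 78)]:
    print(name, pos, "->", residue(HLF_N80, pos))
```

Under this transcription the six cysteines fall at residues 35, 43, 49, 62, 64 and 73 of SEQ ID NO:2, and Tyr-68, Gly-70 and Arg-72 lie between the fourth and sixth cysteines, in agreement with the numbering given above.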

Using a consensus sequence derived from EGF and Heregulin
sequences we screened the HGS database of over 800,000 sequences. As
shown in Figure 5, one cDNA encoded the HLF sequence, which has 34-38%
similarity to EGF/Heregulin family members within the EGF-like domain.
Importantly, in the HLF sequence, all of the conserved cysteine residues, the
G39, and the R41 (Gly-70 and Arg-72 of SEQ ID NO:2, respectively) are
exactly conserved. There is additional sequence conservation in NRG-3,
notably hydrophobic amino acids at positions 13, 15, 37 and 46-47 (Leu-44,
Asp-46, Tyr-68, and Leu-77-Pro-78 of SEQ ID NO:2, respectively). The
lengths of the B and C loops are more similar to heregulin than EGF. Within the
coding frame defined by our current cDNA clone there is a sequence of
hydrophobic amino acids that is consistent with a transmembrane domain
C-terminal to the EGF-like domain. This structure is similar to the transmembrane
domains found in α-TGF, EGF, heregulin and other ligand precursor proteins
(not shown in Figure 5). The sequence attributes of the HLF cDNA make it a
strong candidate as encoding a novel growth factor binding one or several of the
erbB family of receptors. NRG-2 is 36% identical to HLF in the highly
conserved EGF-like domain; therefore they are products of distinct genes.
Don-1 has also been recently and independently identified by sequence similarity to
EGF/Heregulin and is apparently the product of the same gene as NRG-2.

Demonstration that HLF activates erbB family proteins. 

In order to obtain an initial estimation for the action of HLF as a ligand 
for erbB family of receptors we generated recombinant protein in E. coli using a 
GST fusion system (see also Example 1). The EGF-like domain of HLF was 
released and purified from the GST by thrombin cleavage. The resulting protein 
contained a single polypeptide when analyzed by SDS-PAGE. 

To test for the ability of recombinant HLF to activate receptors of the erbB 
family we used a tyrosine kinase activation assay. Tyrosine phosphate 
containing proteins were then identified by immunoblotting using 
anti-phosphotyrosine antibodies. We found a clear increase in the tyrosine
phosphorylation at ~p185 when recombinant HLF is applied to MCF-7. In
MCF-7 cells recombinant heregulin-1β results in a large increase in tyrosine
phosphorylated proteins at about this size. These results indicate that 
recombinant HLF is able to activate phosphorylation of at least one of the 
members of the erbB family expressed in MCF-7 cells. 

HLF Activates multiple erbB proteins. 

In order to begin the analysis of the HLF receptor we used an
experimental system where the display of erbB proteins can be controlled. The
32D cell is a murine myeloid cell line which is devoid of expression of genes of
the erbB family. Growth of 32D cells is dependent on the IL-3 present in
WEHI conditioned media. When expression constructs encoding an erbB
protein are introduced into 32D cells the resulting cell can survive in the absence
of IL-3 if an appropriate EGF/Heregulin family member is present. For
example, the introduction of EGF Receptor expression leads to growth of 32D
cells in the presence of EGF or α-TGF and introduction of erbB4 allows growth
in heregulin. Similar experimental systems have been used to examine the
receptor specificity of the newly discovered NRG-2. We also show that growth
of 32D cells in the presence of HLF occurs only when EGF Receptor or erbB4
is present singly or when erbB2 and erbB3 are present in combination. The
expression of erbB2 or erbB3 alone does not lead to HLF induced growth. To
confirm that this growth stimulation was the result of receptor activation we
determined whether HLF induces the tyrosine phosphorylation of EGF
Receptor, erbB4 or erbB2 and erbB3 when expressed together. We show the
appearance of an appropriate sized band when cell lysates of these 32D cells are
probed by antiphosphotyrosine antibodies. These results are strong evidence
that HLF can activate erbB1 homodimers (the EGF receptor), erbB4
homodimers, and erbB2 + erbB3 heterodimers.
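
The qualitative outcome of the 32D experiments described in this paragraph can be summarized as a small lookup table. The Python sketch below is illustrative only: it encodes the reported growth responses, reading the receptor names as erbB1 (the EGF receptor) through erbB4, and the names `HLF_GROWTH_32D` and `hlf_growth` are hypothetical. Receptor combinations that were not described return `None` rather than a guessed result.

```python
# Reported HLF-dependent growth of 32D cells, keyed by the set of erbB
# proteins expressed (qualitative summary only; no quantitative data implied).
HLF_GROWTH_32D = {
    frozenset({"erbB1"}): True,            # EGF Receptor alone
    frozenset({"erbB2"}): False,           # erbB2 alone
    frozenset({"erbB3"}): False,           # erbB3 alone
    frozenset({"erbB4"}): True,            # erbB4 alone
    frozenset({"erbB2", "erbB3"}): True,   # erbB2 and erbB3 in combination
}


def hlf_growth(receptors):
    """Return the reported growth outcome, or None if the combination was not described."""
    return HLF_GROWTH_32D.get(frozenset(receptors))


for combo in (["erbB1"], ["erbB3"], ["erbB2", "erbB3"], ["erbB1", "erbB3"]):
    print(" + ".join(combo), "->", hlf_growth(combo))
```

Returning `None` for untested combinations keeps the table from implying results, such as erbB1 + erbB3 heterodimers, that remain to be determined.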

The results of 32D experiments indicate that the receptor binding pattern 
of HLF is complex. In order to confirm that HLF activates proteins other than 
erbB4 in MCF-7 cells we immunoprecipitated erbB3 and determined the level of
tyrosine phosphorylation by immunoblot. We also observed that erbB3 is 
phosphorylated on tyrosine as a consequence of HLF stimulation. 

Biological Activity of HLF. 

The effects of the EGF/Heregulin family vary significantly. Differing
cellular phenotypes can be induced by different ligands in the same cell system
and the same ligand can cause differing effects among different cells. Mitogenic
activity of HLF has been detected in 32D cell experiments. The MCF-7 cell is 
dependent on estrogens in the media either in the form of phenol red or present 
in the fetal bovine serum. Little proliferation is seen in phenol red free media 
containing serum treated with charcoal to remove steroids. Heregulin is able to 
promote growth in the absence of estrogen. When HLF is added there is also a 
clear growth stimulation. Growth inhibitory effects have also been observed. 
HLF inhibits the growth of the breast cancer cell line MDA-MB-468. These 
cells overexpress the EGF receptor and can be stimulated by EGF at low 
concentrations (<10 nM) and growth inhibited at higher concentrations (>10 
nM). HLF was found to inhibit growth of MDA-MB-468 under conditions 
similar to those producing growth stimulation of 32D cells containing EGF 
Receptor. No growth suppression or stimulation is seen when HLF is applied
to MCF-7 cells when they are grown in media containing agonists for the 
estrogen receptor. 

HLF mRNA expression in breast cancer.

Preliminary experiments using northern blotting methods showed a
weak signal for HLF mRNA in adult brain with a size of approximately 2 kb
(data not shown). Similar northern blot results are reported in the recent HLF 
study. Because of the weakness of this signal we have used RT-PCR to detect 
HLF mRNA. We have confirmed expression in the brain and detect equivalent 
signals in samples of normal and breast cancer tissue. RT-PCR employed two 
primer sets. The two primer sets generated concordant results. This indicates 
that the bands observed by RT-PCR were due to actual HLF mRNA. In 
addition all assays included control reactions lacking reverse transcriptase in 
order to detect the presence of contaminating DNA. The observed band at 340 
bp corresponds to the predicted size based on the HLF cDNA. It was cloned,
sequenced and shown to contain HLF coding information. Bands at 500 bp and
120 bp were also sequenced. These do not contain HLF coding information
and thus likely represent mispriming by the RT-PCR oligonucleotides on 
unrelated mRNAs. These results are strong evidence that HLF can be 
expressed in human breast cancer cell lines. 
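
The predicted size of the RT-PCR product can be checked against SEQ ID NO:1 using the primers listed under Materials and Methods (SEQ ID NO:20 and SEQ ID NO:21). The Python sketch below is illustrative only: the template string is a transcription of nucleotides 1 to 382 of SEQ ID NO:1 from the Sequence Listing (the transcription is an assumption), and the names `HLF_CDNA_1_382`, `reverse_complement` and `amplicon_length` are hypothetical helpers.

```python
# Nucleotides 1-382 of SEQ ID NO:1, one Sequence Listing row per line
# (transcribed from the listing; treat the transcription as an assumption).
HLF_CDNA_1_382 = (
    "CTCTTCTTCCTCCTCCGCTACCACCACCACACCAGAAACTAGCACC"
    "AGCCCCAAATTTCATACGACGACATATTCCACAGAGCGATCCGAGCAC"
    "TTCAAACCCTGCCGAGACAAGGACCTTGCATACTGTCTCAATGATGGC"
    "GAGTGCTTTGTGATCGAAACCCTGACCGGATCCCATAAACACTGTCGG"
    "TGCAAAGAAGGCTACCAAGGAGTCCGTTGTGATCAATTTCTGCCGAAA"
    "ACTGATTCCATCTTATCGGATCCAAACCACTTGGGGATTGAATTCATG"
    "GAGAGTGAAGAAGTTTATCAAAGGCAGGTGCTGTCAATTTCATGTATC"
    "ATCTTTGGAATTGTCATCGTGGGCATGTTCTGTGCAGCATTCTACTTC"
)

UPSTREAM_PRIMER = "TACCACCACCACACCAGAAA"    # SEQ ID NO:21
DOWNSTREAM_PRIMER = "CCACGATGACAATTCCAAAG"  # SEQ ID NO:20


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, upstream, downstream):
    """Length of the product spanning both primer sites on the sense strand."""
    start = template.find(upstream)
    site = template.find(reverse_complement(downstream))
    if start < 0 or site < 0:
        raise ValueError("primer site not found in template")
    end = site + len(downstream)
    if end <= start:
        raise ValueError("primers are not in a convergent orientation")
    return end - start


print(amplicon_length(HLF_CDNA_1_382, UPSTREAM_PRIMER, DOWNSTREAM_PRIMER))
```

With this transcription the calculation gives a 338 bp product, consistent with the band observed at approximately 340 bp.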



DISCUSSION 

Our results suggest that HLF can bind and activate erbB1 and
heterodimers of erbB2 + erbB3. Taken together the available data suggest that
the precise receptor binding and activation profile of HLF is complex. Our
results demonstrate that erbB3 can be phosphorylated as a consequence of HLF
binding. The HLF induced increase in tyrosine phosphorylation on erbB3
suggests that the erbB3 protein can be part of an HLF receptor. Our studies of
32D cells support this conclusion, where erbB2 is the other member of the
heterodimeric receptor with erbB3. Still to be determined is whether HLF can
bind to erbB1 + erbB3 heterodimers or erbB3 + erbB4 heterodimers or erbB2 +
erbB4 heterodimers. Our preliminary data do conclusively demonstrate that
HLF is a new ligand for the erbB family of receptors. These results suggest
that HLF may have a receptor specificity somewhat analogous to β-cellulin.

In adult tissue expression levels of HLF are low but detectable using 
sensitive methods such as RT-PCR. HLF is expressed at the highest levels in 
brain where it is likely to play a critical role in morphogenesis. Interestingly, 
we identify HLF expression in a breast cancer cell line, MCF-7, that clearly has
receptors that can be activated by HLF. Our results also show that HLF can
cause alteration of growth of MCF-7 cancer cells in vitro. The ability to cause
growth of MCF-7 cells in the absence of estrogen is similar to that previously
reported for heregulin. Our results suggest that effects on cell phenotype by
HLF may depend on the cell line. MDA-MB-468 cells, which have high levels of
EGFR, are growth inhibited by HLF in vitro.

The results in this paper together with those recently reported
identify HLF as a new ligand for the erbB family of growth factor receptors and 
suggest a role for HLF in the growth regulation of normal and malignant breast 
epithelial cells. 

MATERIALS AND METHODS 

Preparation of Recombinant HLF. Preparation of recombinant HLF is
also described in Example 1. In the case of the protein produced in this
Example, the coding segment containing the EGF-like domain of HLF
(nucleotides 79 to 279 of HGS38) was amplified by PCR and inserted into the
pGEX3 plasmid for expression as a fusion protein with bacterial glutathione
S-transferase. Protein was prepared using standard methods. Bacteria were
cultured to an OD600 of approximately 0.4 and induced to express recombinant
protein by addition of 0.1 mM IPTG. Bacteria were collected by centrifugation,
resuspended in 1X PBS and lysed by sonication. Recombinant protein was
collected by incubation with glutathione beads. After washing, the recombinant
HLF protein was cleaved from the GST bound to the beads by thrombin
cleavage for 18 hours. Thrombin was removed by incubation with
p-aminobenzamidine agarose beads. Refolding followed the methods used for
the preparation of recombinant antibody fragments. Briefly, recombinant HLF
was denatured in 6 M guanidine HCl containing 65 mM DTE. This was
rapidly diluted 100 fold to a final protein concentration of 100 µg/ml into 0.4 M
arginine, 0.1 M Tris pH 8.0, 0.9 mM oxidized glutathione, 2.0 mM EDTA.
Refolding was allowed to proceed for 24 hours at 4°C. Refolded protein was
extensively dialyzed against PBS using 3000 Da cutoff membranes. Protein
preparations were stored at -20°C.
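
The concentrations implied by the 100-fold refolding dilution can be worked out as follows. This Python sketch is illustrative only and assumes a simple volumetric dilution into a refolding buffer that contains no guanidine or DTE; the function names `before_dilution` and `after_dilution` are hypothetical.

```python
def before_dilution(final_value, fold):
    """Concentration of a solute before an n-fold dilution into solute-free buffer."""
    return final_value * fold


def after_dilution(initial_value, fold):
    """Concentration of a solute after an n-fold dilution into solute-free buffer."""
    return initial_value / fold


FOLD = 100

# 100 ug/ml final protein concentration implies this stock in denaturant (mg/ml).
protein_stock_mg_per_ml = before_dilution(100, FOLD) / 1000

# Carry-over of denaturant and reductant into the refolding mixture.
guanidine_final_molar = after_dilution(6.0, FOLD)
dte_final_millimolar = after_dilution(65.0, FOLD)

print("Protein in denaturant (mg/ml):", protein_stock_mg_per_ml)
print("Residual guanidine HCl (M):", guanidine_final_molar)
print("Residual DTE (mM):", dte_final_millimolar)
```

Under these assumptions the denatured protein is at about 10 mg/ml before dilution, and the refolding mixture carries roughly 0.06 M guanidine HCl and 0.65 mM DTE alongside the 0.9 mM oxidized glutathione.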

Detection of receptor activation by phosphotyrosine immunoblot. 

Cells were starved (24 hours for MCF-7, 4 hours for 32D derived cell 
lines) before addition of the indicated amounts of growth factors. Total cell 
lysates were prepared by addition of SDS PAGE sample buffer (1% SDS, 
0.15 M Tris pH 8.6, 5% BME and 1 mM sodium orthovanadate) directly to
cells. Cell lysates were run on 8-16% Tris-Glycine gradient gels (Novex).
Proteins were transferred onto Hybond ECL nitrocellulose membranes 
(Amersham) and were immunoblotted with anti-phosphotyrosine MAb. 

32D cell experiments. 

32D cells containing expression constructs for erbB1, erbB2, erbB3,
erbB4, or erbB2 and erbB3 together were grown in IL-3 containing media (WEHI
conditioned media) or Hrg-1β prior to the experiment. Expression of the erbB
proteins was verified by FACS analysis using erbB-specific antisera. Cells (10⁴
per well) were plated in 24 well dishes in the absence of IL-3 containing media
(DMEM, 10% FCS) or in the presence of the indicated growth factors,
heregulin-1β (100 ng/ml), EGF (100 ng/ml), and HLF (10 µg/ml). Cells were
allowed to grow for 3 days and viable cells counted using a hemocytometer.

Immunoprecipitation and Immunoblot of erbB3. 

Cells were plated in 80 cm² dishes (DMEM + 10% FCS) until 80%
confluent. Cells were then allowed to become quiescent in serum free media
(DMEM) for 24 hours. Cells were then stimulated with the indicated growth
factors, heregulin-1β (1 µg/ml) and HLF (10 ng/ml), for 15 minutes. Cells
were lysed in 1% Triton X-100 in PBS containing 1 mM sodium orthovanadate.
Nuclei were removed by centrifugation. erbB3 proteins were
immunoprecipitated (2 hours at 4°C) using monoclonal anti-erbB3 antibodies
(Neomarkers) and collected on protein A sepharose. Proteins were released by
incubation in 1% SDS containing PAGE sample buffer at 100°C and
electrophoresed on 8-16% gels (Novagen). Proteins were transferred to
nitrocellulose. Proteins containing pTyr were detected using monoclonal
anti-phosphotyrosine antibodies (Oncogene Science) and the ECL detection
system (Amersham).

Growth Assays.

Cells were plated in IMEM + 10% FBS at 3000 cells per well in 96 well
dishes. Cells were allowed to become quiescent in serum free IMEM for 24
hours and growth factors EGF (2 ng/ml) and HLF (10 µg/ml) were added to the
media. Growth of cells at 1, 3, and 5 days was monitored using the XTT assay
method. XTT was added at 10 ng/ml in IMEM and PMS (1.5 mg/ml in PBS) to
25% of volume of well for 4 hours at 37°C. OD was monitored at 540 nm.

Detection of HLF mRNA. 

Total RNA was extracted from cultured cells using the RNAzol B method
(Tel-Test, CS-104). The final RNA pellet was resuspended in 135 µl DEPC
treated H2O. DNase treatment was performed using the SNAP RNA isolation
kit (Invitrogen, K1950-01). Briefly, 10X DNase buffer and RNase free DNase
I were added to each sample and incubated for 20 min at 37°C. RNA
purification was performed as indicated in the kit. Concentration of each sample
was determined, samples were dried and resuspended to give a final
concentration of 2 µg/µl.

RT-PCR was performed using 2 µg of total RNA in the GeneAmp RNA
PCR Core kit (Perkin Elmer, N808-0143). cDNA was synthesized using the
downstream primer 5'-CCA CGA TGA CAA TTC CAA AG-3' (SEQ ID
NO:20). Samples were reverse transcribed 1 h at 37°C. RT was heat
inactivated 5 min at 99°C, and samples were cooled on ice. PCR was performed
with the entire RT reaction using the upstream primer 5'-TAC CAC CAC CAC
ACC AGA AA-3' (SEQ ID NO:21). The reaction was performed for 40 cycles,
1 min at 94°C, 1 min 30 sec at 58°C, 2 min at 72°C followed by an extension
for 8 min at 72°C. Samples were electrophoresed on an agarose gel and
visualized with ethidium bromide staining.
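
The total programmed time of the RT-PCR protocol above can be tallied as follows. This Python sketch is illustrative only: it counts the stated hold times and ignores temperature ramping, cooling on ice and handling, and all names (`STEP_SECONDS`, `pcr_hold_seconds`, and so on) are hypothetical.

```python
CYCLES = 40

# Hold times per PCR cycle, in seconds, as stated in the protocol.
STEP_SECONDS = {
    "denature_94C": 60,
    "anneal_58C": 90,
    "extend_72C": 120,
}

FINAL_EXTENSION_SECONDS = 8 * 60   # 8 min at 72 C
RT_SECONDS = 60 * 60               # reverse transcription, 1 h at 37 C
RT_INACTIVATION_SECONDS = 5 * 60   # heat inactivation, 5 min at 99 C


def pcr_hold_seconds(cycles=CYCLES):
    """Sum of programmed hold times for the PCR stage, excluding ramping."""
    return cycles * sum(STEP_SECONDS.values()) + FINAL_EXTENSION_SECONDS


total_seconds = RT_SECONDS + RT_INACTIVATION_SECONDS + pcr_hold_seconds()

print("PCR stage (min):", pcr_hold_seconds() / 60)
print("RT plus PCR (min):", total_seconds / 60)
```

On these figures the PCR stage accounts for 188 minutes of hold time and the complete RT-PCR for 253 minutes, before any allowance for ramp rates of the particular thermal cycler.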

Confirmation of the sequence of the bands was performed by purifying 
the bands from agarose gel slices (Wizard PCR preps DNA purification system, 
Promega, A7170) and cloning into a TA vector (Invitrogen, K2000-01) for
automated sequencing. Bands of unknown identity present in the reaction 
products were cloned and sequenced in a similar fashion. 

It will be clear that the invention may be practiced otherwise than as
particularly described in the foregoing description and examples. Numerous 
modifications and variations of the present invention are possible in light of the 
above teachings and, therefore, are within the scope of the appended claims. 

The entire disclosure of all publications (including patents, patent 
applications, journal articles, manuscripts, laboratory manuals, books, or other 
documents) cited herein are hereby incorporated by reference. 

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Young, Paul 

King, C. Richter 
Hijazi, Mai 
Ruben, Steve 

(ii) TITLE OF INVENTION: Heregulin-Like Factor

(iii) NUMBER OF SEQUENCES: 22

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Human Genome Sciences, Inc.

(B) STREET: 9410 Key West Avenue

(C) CITY: Rockville

(D) STATE: MD

(E) COUNTRY: US

(F) ZIP: 20850

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 60/049,942

(B) FILING DATE: 17-JUN-1997

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Hoover, Kenley K.

(B) REGISTRATION NUMBER: 40,302

(C) REFERENCE/DOCKET NUMBER: PF383

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 301-309-3504

(B) TELEFAX: 301-309-8439 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2199 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..475

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

C TCT TCT TCC TCC TCC GCT ACC ACC ACC ACA CCA GAA ACT AGC ACC 46
Ser Ser Ser Ser Ser Ala Thr Thr Thr Thr Pro Glu Thr Ser Thr
1 5 10 15

AGC CCC AAA TTT CAT ACG ACG ACA TAT TCC ACA GAG CGA TCC GAG CAC 94
Ser Pro Lys Phe His Thr Thr Thr Tyr Ser Thr Glu Arg Ser Glu His
20 25 30

TTC AAA CCC TGC CGA GAC AAG GAC CTT GCA TAC TGT CTC AAT GAT GGC 142
Phe Lys Pro Cys Arg Asp Lys Asp Leu Ala Tyr Cys Leu Asn Asp Gly
35 40 45

GAG TGC TTT GTG ATC GAA ACC CTG ACC GGA TCC CAT AAA CAC TGT CGG 190
Glu Cys Phe Val Ile Glu Thr Leu Thr Gly Ser His Lys His Cys Arg
50 55 60

TGC AAA GAA GGC TAC CAA GGA GTC CGT TGT GAT CAA TTT CTG CCG AAA 238
Cys Lys Glu Gly Tyr Gln Gly Val Arg Cys Asp Gln Phe Leu Pro Lys
65 70 75

ACT GAT TCC ATC TTA TCG GAT CCA AAC CAC TTG GGG ATT GAA TTC ATG 286
Thr Asp Ser Ile Leu Ser Asp Pro Asn His Leu Gly Ile Glu Phe Met
80 85 90 95

GAG AGT GAA GAA GTT TAT CAA AGG CAG GTG CTG TCA ATT TCA TGT ATC 334
Glu Ser Glu Glu Val Tyr Gln Arg Gln Val Leu Ser Ile Ser Cys Ile
100 105 110

ATC TTT GGA ATT GTC ATC GTG GGC ATG TTC TGT GCA GCA TTC TAC TTC 382
Ile Phe Gly Ile Val Ile Val Gly Met Phe Cys Ala Ala Phe Tyr Phe
115 120 125

AAA AGC AAA AGG AAT ATT ACA GCA AAT TCT GTG TCT GAG GAA AGA TGG 430
Lys Ser Lys Arg Asn Ile Thr Ala Asn Ser Val Ser Glu Glu Arg Trp
130 135 140

AAG GGT CTG CCT TCC CAG GAG CCC AAT CTG CAA CAA GAC AAA TAA 475
Lys Gly Leu Pro Ser Gln Glu Pro Asn Leu Gln Gln Asp Lys *
145 150 155

TGCCTAACAA TGGATTAATG ATGTCTACTA TTCTGCAACT TACATCTCAT TTCTTTCTAA 535

TGCATTGGAC CAGAGAAATT TAAAACTCAA ATGAACTGTA AAGTTTCCAC ACTGACACTG 595

TTGGGCTAAT AGTATTCCCA TGTGCAAGGC ATGCATCTTT TCTTCCCCAG AGCAATGCCT 655

CTCATGAGAG AGCTAATGGT ATTGCAATCA GCTGCTGATT GTTTTCTCTG TTCCCATTTT 715

CTGGGTGAAG GAAGAAAGAG CAAAAAAGTG TGTGCTTGTG AGAGAGGAGG GATGGTAGAT 775

AGGCAGAGGC AGGCTCAGAA TGGAAGGACC ACGTATCTTG GAATATTACT AAGTCAGGAC 835

TTGAGTGAAA AAAGACTAAA GGTAAGCAAA TTATAAAAGG ATTTAGGAAA CGCAGTCCGG 895

TATTGGATAT TGCTTAAAGA AAATTCCCTT ATAAGTTTAT ACTTCCAAGA CTCTGAATTG 955

GATTACTGCA AACATCATTA AGTGTTTCTA ATTTAATCCC ATGAGAGTAA TGGAATCCTT 1015

GCTCTGAGAC ATGCACTCTT ACTTTTTCAG GATGATTTAC CAGACTAGAA CCTCCTGATT 1075

TCCCCTTTTT TGTGTGTGTG AATGAACCCC TGATAAAATC TTGTGGCTGT AACATGCTCC 1135

TTAAAATGCT GATATGATAG ATTTATTTTT AACAATAGGC TATAGATTAG CTGTTAGGAA 1195

GCAAATAGAT TATTACAACA GGATTAAAGC AACTAAGAGT GCTAGAGATA AAAGTCTCCC 1255

AAATAATTGG AAAGATAAAA GAAATATCTT AAAAAACAGA GCTACATCAC ACTGATATTG 1315

TAAATTCAAA ATGGGTAATG AAGCTCAAAG CCTCCAAAGC TTGCAGCAAG TGCTGGTGAA 1375

TTGCTTGGGA AGATGCAACT AGTGTAATCT TTTACCTTTG GGTCAATGTT CTGATTCTTT 1435

TGCAGCTTCT GCTCACAAGA CTGAGCTTGC TTGATGGTAT CGGGAAAGAT ATGAACATTT 1495

TGCGTGTGCC TCCACATGCA GCCACCACAG TGTCCGTGGA AGATAGCTTT TATGAACTTC 1555

ATTTACAGAG GAGGAAATGG AGGCTCAACA AGTTTAGGAA ATTATTAGGG TAGCAAAACT 1615

ag7gggtagc agagtgggat tcaaatccca gtccctgtga tacaataagc cacgctctgt 167 5 

/ ;'; :tgctac tgactggaga agctcattgc taagaccggc catgtgctcc actgacggca 17 35 

;,\:~7~tct cagagacgtt ggaagacagg caaaattcaa gggcatgatt ctactgggaa 1795 

*. :-: ~: \-v:a atcaaaatgg agtcatttgt gttaaaaacc ctgacaaata gagccggaga 1855 

a.. "aa gggagcagtc acgtaggcaa atgcctgatt acaagaacta tcacaaaagt 1915 

77 37 .v.-aaac cgcagctttg catgaagact attgcagcct tacacgcacg aaaatagttc 197 5 

73caaggaca tatgcccagc aacttcctgt ccacccttgg actggctcct cctttcttgg 2035 

ga7cc7tcca gccaaggata gtgacctcaa atcagttgtg tacctaacgt ttcctgtctt 20 95 

cctactgata aaacatagtt tcctatatcg tgtgtattcc cattgcaaca cttatttcca 2155 

aataaatat7 ttcttttaga gtctcaaaaa aaaaaaaaaa aaaa 2199 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 158 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Ser Ser Ser Ser Ser Ala Thr Thr Thr Thr Pro Glu Thr Ser Thr Ser
1 5 10 15

Pro Lys Phe His Thr Thr Thr Tyr Ser Thr Glu Arg Ser Glu His Phe
20 25 30

Lys Pro Cys Arg Asp Lys Asp Leu Ala Tyr Cys Leu Asn Asp Gly Glu
35 40 45

Cys Phe Val Ile Glu Thr Leu Thr Gly Ser His Lys His Cys Arg Cys
50 55 60

Lys Glu Gly Tyr Gln Gly Val Arg Cys Asp Gln Phe Leu Pro Lys Thr
65 70 75 80

Asp Ser Ile Leu Ser Asp Pro Asn His Leu Gly Ile Glu Phe Met Glu
85 90 95

Ser Glu Glu Val Tyr Gln Arg Gln Val Leu Ser Ile Ser Cys Ile Ile
100 105 110

Phe Gly Ile Val Ile Val Gly Met Phe Cys Ala Ala Phe Tyr Phe Lys
115 120 125

Ser Lys Arg Asn Ile Thr Ala Asn Ser Val Ser Glu Glu Arg Trp Lys
130 135 140

Gly Leu Pro Ser Gln Glu Pro Asn Leu Gln Gln Asp Lys
145 150 155

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 645 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Ser Glu Arg Lys Glu Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys 
1 5 10 15 

Glu Arg Gly Ser Gly Lys Lys Pro Glu Ser Ala Ala Gly Ser Gin Ser 
20 25 30 

Pro Ala Leu Pro Pro Gin Leu Lys Glu Met Lys Ser Gin Glu Ser Ala 
35 40 45 

Ala Gly Ser Lys Leu Val Leu Arg Cys Glu Thr Ser Ser Glu Tyr Ser 
50 55 60 

Ser Leu Arg Phe Lys Trp Phe Lys Asn Gly Asn Glu Leu Asn Arg Lys 
65 " 70 75 80 

Asn Lvs Pro Gin Asn lie Lys lie Gin Lys Lys Pro Gly Lys Ser Glu 
85 90 95 

Leu Arg lie Asn Lys Ala Ser Leu Ala Asp Ser Gly Glu Tyr Met Cys 
100 105 110 

Lys Val lie Ser Lys Leu Gly Asn Asp Ser Ala Ser Ala Asn lie Thr 
115 " 120 125 

lie Val Glu Ser Asn Glu lie lie Thr Gly Met Pro Ala Ser Thr Glu 
13C 135 140 
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Gly Ala Tyr Val Ser Ser Glu Ser Pro lie Arg He Ser Val Ser Thr 
145 150 155 160 

Glu Gly Ala Asn Thr Ser Ser Ser Thr Ser Thr Ser Thr Thr Gly Thr 
165 170 175 

Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val Asn 
180 185 190 

Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg Tyr 
195 200 205 

Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn Tyr 
210 215 220 

Val Met Ala Ser Phe Tyr Lys His Leu Gly Ile Glu Phe Met Glu Ala 
225 230 235 240 

Glu Glu Leu Tyr Gln Lys Arg Val Leu Thr Ile Thr Gly Ile Cys Ile 
245 250 255 

Ala Leu Leu Val Val Gly Ile Met Cys Val Val Ala Tyr Cys Lys Thr 
260 265 270 

Lys Lys Gln Arg Lys Lys Leu His Asp Arg Leu Arg Gln Ser Leu Arg 
275 280 285 

Ser Glu Arg Asn Asn Met Met Asn Ile Ala Asn Gly Pro His His Pro 
290 295 300 

Asn Pro Pro Pro Glu Asn Val Gln Leu Val Asn Gln Tyr Val Ser Lys 
305 310 315 320 

Asn Val Ile Ser Ser Glu His Ile Val Glu Arg Glu Ala Glu Thr Ser 
325 330 335 

Phe Ser Thr Ser His Tyr Thr Ser Thr Ala His His Ser Thr Thr Val 
340 345 350 

Thr Gln Thr Pro Ser His Ser Trp Ser Asn Gly His Thr Glu Ser Ile 
355 360 365 

Leu Ser Glu Ser His Ser Val Ile Val Met Ser Ser Val Glu Asn Ser 
370 375 380 

Arg His Ser Ser Pro Thr Gly Gly Pro Arg Gly Arg Leu Asn Gly Thr 
385 390 395 400 

Gly Gly Pro Arg Glu Cys Asn Ser Phe Leu Arg His Ala Arg Glu Thr 
405 410 415 

Pro Asp Ser Tyr Arg Asp Ser Pro His Ser Glu Arg Tyr Val Ser Ala 
420 425 430 

Met Thr Thr Pro Ala Arg Met Ser Pro Val Asp Phe His Thr Pro Ser 
435 440 445 

Ser Pro Lys Ser Pro Pro Ser Glu Met Ser Pro Pro Val Ser Ser Met 
450 455 460 

Thr Val Ser Met Pro Ser Met Ala Val Ser Pro Phe Met Glu Glu Glu 
465 470 475 480 

Arg Pro Leu Leu Leu Val Thr Pro Pro Arg Leu Arg Glu Lys Lys Phe 
485 490 495 

Asp His His Pro Gln Gln Phe Ser Ser Phe His His Asn Pro Ala His 
500 505 510 

Asp Ser Asn Ser Leu Pro Ala Ser Pro Leu Arg Ile Val Glu Asp Glu 
515 520 525 

Glu Tyr Glu Thr Thr Gln Glu Tyr Glu Pro Ala Gln Glu Pro Val Lys 
530 535 540 

Lys Leu Ala Asn Ser Arg Arg Ala Lys Arg Thr Lys Pro Asn Gly His 
545 550 555 560 

Ile Ala Asn Arg Leu Glu Val Asp Ser Asn Thr Ser Ser Gln Ser Ser 
565 570 575 

Asn Ser Glu Ser Glu Thr Glu Asp Glu Arg Val Gly Glu Asp Thr Pro 
580 585 590 

Phe Leu Gly Ile Gln Asn Pro Leu Ala Ala Ser Leu Glu Ala Thr Pro 
595 600 605 

Ala Arg Leu Ala Asp Ser Arg Thr Asn Pro Ala Gly Arg Phe Ser 
610 615 620 

Thr Gln Glu Glu Ile Gln Ala Arg Leu Ser Ser Val Ile Ala Asn Gln 
625 630 635 640 

Asp Pro Ile Ala Val 
645 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 536 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GGCACAGCTC TTCTTCCTCC TCCGCTACCA CCACCACACC AGAAACTAGC ACCAGCCCCA 60 

AATTTCATAC GACGACATAT TCCACAGAGC GATCCGAGCA CTTCAAACCC TGCCGAGACA 120 

AGGACCTTGG CATACTGTCT CAATGATGGC GAGTGCTTTG TGATCGAAAC CCTGACCGGA 180 

TCCCATTAAA CACTGTCGGT GCAAAGAAGG CTACCAAGGA GTCCGTTGTG ATCAATTTCT 240 

GCCGAAAACT GATTCCATCT TATCGGATCC AAACCACTTG GGGATTGGAA TTCATGGGAG 300 

AGTGAAGAAG TTTTNNCCAA AGGGCAGGTG NTGTNCAATT TCCAAGTGNN CAACTTTGGG 360 

GATTGGTNCN TCGTGGGGGC NTGTTTNNGG TGGCAGCATT TCNTAACTNC CAAAAAGCCA 420 

AAAAGGGATT TTTNACCGGC AAATTTCCGT GNTCTGAAGG GAAAATTGGG AAGGGTCTTG 480 

CCCTTTCCCC AGGAGGCCCA ATTNGGNCAA CAAGGCCAAT NATGGCNTAA CAAGGG 536 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GGCGGATCCC TCTTCTTCCT CCTCC 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GGCGAATTCT AAACTTCTTC ACTCTCCATG AATTCAATCC CC 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GGCGGATCCC CTCTTCTTCC TCCTCC 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGCGGTACCT AAACTTCTTC ACTCTCCATG AATTCAATCC CC 42 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GCCGGATCCG CCACCATGAA CTCCTTCTCC ACAAGCGCCT TCGGTCCAGT TGCCTTCTCC 60 
CTGGGGCTGC TCCTGGTGTT GCCTGCTGCC TTCCCTGCCC CAGTCTCTTC TTCCTCCTCC 120 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GGCTCTAGAT AAACTTCTTC ACTCTCCATG AATTCAATCC CC 42 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Ser His Phe Asn Asp Cys Pro Asp Ser His Thr Gln Phe Cys Phe His 
1 5 10 15 

Gly Thr Cys Arg Phe Leu Val Gln Glu Asp Lys Pro Ala Cys Val Cys 
20 25 30 

His Ser Gly Tyr Val Gly Ala Arg Cys Glu His Ala Asp Leu Leu Ala 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Arg Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu 
1 5 10 15 

His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys 
20 25 30 

Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu 
35 40 45 

Lys Trp 
50 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Gly Lys Lys Arg Asp Pro Cys Leu Arg Lys Tyr Lys Asp Phe Cys Ile 
1 5 10 15 

His Gly Glu Cys Lys Tyr Val Lys Glu Leu Arg Ala Pro Ser Cys Ile 
20 25 30 

Cys His Pro Gly Tyr Gly Gly Glu Arg Cys His Gly Leu Ser Leu Pro 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Arg Lys Lys Lys Asn Pro Cys Asn Ala Glu Phe Gln Asn Phe Cys Ile 
1 5 10 15 

His Gly Glu Cys Lys Tyr Ile Glu His Leu Glu Ala Val Thr Cys Lys 
20 25 30 

Cys Gln Gln Glu Tyr Phe Gly Glu Arg Cys Gly Glu Lys Ser Met Lys 
35 40 45 

Thr 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Lys Gly His Phe Ser Arg Cys Pro Lys Gln Tyr Lys His Tyr Cys Ile 
1 5 10 15 

Lys Gly Arg Cys Arg Phe Val Val Ala Glu Gln Thr Pro Ser Cys Val 
20 25 30 

Cys Asp Glu Gly Tyr Ile Gly Ala Arg Cys Glu Arg Val Asp Leu Phe 
35 40 45 

Tyr 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Thr Ser His Leu Ile Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val 
1 5 10 15 

Asn Gly Gly Glu Cys Phe Thr Val Lys Asp Leu Ser Asn Pro Ser Arg 
20 25 30 

Tyr Leu Cys Lys Cys Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu Asn 
35 40 45 

Val Pro Met Lys 
50 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val 
1 5 10 15 

Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg 
20 25 30 

Tyr Leu Cys Lys Cys Gln Pro Gly Phe Thr Gly Ala Arg Cys Thr Glu 
35 40 45 

Asn Val Pro Met Lys 
50 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Thr Ser His Leu Val Lys Cys Ala Glu Lys Glu Lys Thr Phe Cys Val 
1 5 10 15 

Asn Gly Gly Glu Cys Phe Met Val Lys Asp Leu Ser Asn Pro Ser Arg 
20 25 30 

Tyr Leu Cys Lys Cys Pro Asn Glu Phe Thr Gly Asp Arg Cys Gln Asn 
35 40 45 

Tyr Val Met Ala Ser 
50 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Ser Gly His Ala Arg Lys Cys Asn Glu Thr Ala Lys Ser Tyr Cys Val 
1 5 10 15 

Asn Gly Gly Val Cys Tyr Tyr Ile Glu Gly Ile Asn Gln Leu Ser Cys 
20 25 30 

Lys Cys Pro Val Gly Tyr Thr Gly Asp Arg Cys Gln Gln Phe Ala Met 
35 40 45 

Val Asn 
50 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
CCACGATGAC AATTCCAAAG 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
TACCACCACC ACACCAGAAA 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 720 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Ser Glu Gly Ala Ala Ala Ala Ser Pro Pro Gly Ala Ala Ser Ala 
1 5 10 15 

Ala Ala Ala Ser Ala Glu Glu Gly Thr Ala Ala Ala Ala Ala Ala Ala 
20 25 30 

Ala Ala Gly Gly Gly Pro Asp Gly Gly Gly Glu Gly Ala Ala Glu Pro 
35 40 45 

Pro Arg Glu Leu Arg Cys Ser Asp Cys Ile Val Trp Asn Arg Gln Gln 
50 55 60 

Thr Trp Leu Cys Val Val Pro Leu Phe Ile Gly Phe Ile Gly Leu Gly 
65 70 75 80 

Leu Ser Leu Met Leu Leu Lys Trp Ile Val Val Gly Ser Val Lys Glu 
85 90 95 

Tyr Val Pro Thr Asp Leu Val Asp Ser Lys Gly Met Gly Gln Asp Pro 
100 105 110 

Phe Phe Leu Ser Lys Pro Ser Ser Phe Pro Lys Ala Met Glu Thr Thr 
115 120 125 

Thr Thr Thr Thr Ser Thr Thr Ser Pro Ala Thr Pro Ser Ala Gly Gly 
130 135 140 

Ala Ala Ser Ser Arg Thr Pro Asn Arg Ile Ser Thr Arg Leu Thr Thr 
145 150 155 160 

Ile Thr Arg Ala Pro Thr Arg Phe Pro Gly His Arg Val Pro Ile Arg 
165 170 175 

Ala Ser Pro Arg Ser Thr Thr Ala Arg Asn Thr Ala Ala Pro Ala Thr 
180 185 190 

Val Pro Ser Thr Thr Ala Pro Phe Phe Ser Ser Ser Thr Leu Gly Ser 
195 200 205 

Arg Pro Pro Val Pro Gly Thr Pro Ser Thr Gln Ala Met Pro Ser Trp 
210 215 220 

Pro Thr Ala Ala Tyr Ala Thr Ser Ser Tyr Leu His Asp Ser Thr Pro 
225 230 235 240 

Ser Trp Thr Leu Ser Pro Phe Gln Asp Ala Ala Ser Ser Ser Ser Ser 
245 250 255 

Ser Ser Ser Ser Ser Thr Thr Thr Thr Pro Glu Thr Ser Thr Ser Pro 
260 265 270 

Lys Phe His Thr Thr Thr Tyr Ser Thr Glu Arg Ser Glu His Phe Lys 
275 280 285 

Pro Cys Arg Asp Lys Asp Leu Ala Tyr Cys Leu Asn Asp Gly Glu Cys 
290 295 300 

Phe Val Ile Glu Thr Leu Thr Gly Ser His Lys His Cys Arg Cys Lys 
305 310 315 320 

Glu Gly Tyr Gln Gly Val Arg Cys Asp Gln Phe Leu Pro Lys Thr Asp 
325 330 335 

Ser Ile Leu Ser Asp Pro Thr Asp His Leu Gly Ile Glu Phe Met Glu 
340 345 350 

Ser Glu Glu Val Tyr Gln Arg Gln Val Leu Ser Ile Ser Cys Ile Ile 
355 360 365 

Phe Gly Ile Val Ile Val Gly Met Phe Cys Ala Ala Phe Tyr Phe Lys 
370 375 380 

Ser Lys Lys Gln Ala Lys Gln Ile Gln Glu Gln Leu Lys Val Pro Gln 
385 390 395 400 

Asn Gly Lys Ser Tyr Ser Leu Lys Ala Ser Ser Thr Met Ala Lys Ser 
405 410 415 

Glu Asn Leu Val Lys Ser His Val Gln Leu Gln Asn Tyr Ser Lys Val 
420 425 430 

Glu Arg His Pro Val Thr Ala Leu Glu Lys Met Met Glu Ser Ser Phe 
435 440 445 

Val Gly Pro Gln Ser Phe Pro Glu Val Pro Ser Pro Asp Arg Gly Ser 
450 455 460 

Gln Ser Val Lys His His Arg Ser Leu Ser Ser Cys Cys Ser Pro Gly 
465 470 475 480 

Gln Ser Gly Met Leu His Arg Asn Ala Phe Arg Arg Thr Pro Pro 
485 490 495 

Ser Pro Arg Ser Arg Leu Gly Gly Ile Val Gly Pro Ala Tyr Gln Gln 
500 505 510 

Leu Glu Glu Ser Arg Ile Pro Asp Gln Asp Thr Ile Pro Cys Gln Gly 
515 520 525 

Ile Glu Val Arg Lys Thr Ile Ser His Leu Pro Ile Gln Leu Trp Cys 
530 535 540 

Val Glu Arg Pro Leu Asp Leu Lys Tyr Ser Ser Ser Gly Leu Lys Thr 
545 550 555 560 

Gln Asn Thr Ser Ile Asn Met Gln Leu Pro Ser Arg Glu Thr Asn 
565 570 575 

Pro Tyr Phe Asn Ser Leu Glu Gln Lys Asp Leu Val Gly Tyr Ser Ser 
580 585 590 

Thr Arg Ala Ser Ser Val Pro Ile Ile Pro Ser Val Gly Leu Glu Glu 
595 600 605 

Thr Cys Leu Gln Met Pro Gly Ile Ser Glu Val Lys Ser Ile Lys Trp 
610 615 620 

Cys Lys Asn Ser Tyr Ser Ala Asp Val Val Asn Val Ser Ile Pro Val 
625 630 635 640 

Ser Asp Cys Leu Ile Ala Glu Gln Gln Glu Val Lys Ile Leu Leu Glu 
645 650 655 

Thr Val Gln Glu Gln Ile Arg Ile Leu Thr Asp Ala Arg Arg Ser Glu 
660 665 670 

Asp Tyr Glu Leu Ala Ser Val Glu Thr Glu Asp Ser Ala Ser Glu Asn 
675 680 685 

Thr Ala Phe Leu Pro Leu Ser Pro Thr Ala Lys Ser Glu Arg Glu Ala 
690 695 700 

• Phe Val Leu Arg Asn Glu Ile Gln Arg Asp Ser Ala Leu Thr Lys 
710 715 720 

INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page , line 

B. IDENTIFICATION OF DEPOSIT 

Further deposits are identified on an additional sheet 

Name of depositary institution 

American Type Culture Collection 

Address of depositary institution (including postal code and country) 

10801 University Boulevard 
Manassas, Virginia 20110-2209 
United States of America 

Date of deposit June 19, 1997 

Accession Number 209123 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession 
Number of Deposit") 

For receiving Office use only 

This sheet was received with the international application 

Authorized officer 

For International Bureau use only 

This sheet was received by the International Bureau on: 

Authorized officer 

What Is Claimed Is: 

1. An isolated nucleic acid molecule comprising a 
polynucleotide having a nucleotide sequence at least 95% identical to a sequence selected from 
the group consisting of: 

(a) a nucleotide sequence encoding the HLF polypeptide having the amino acid 
sequence at positions 1 to 157 of SEQ ID NO:2 or the complete amino acid sequence encoded 
by the cDNA clone contained in ATCC Deposit No. 209123; 

(b) a nucleotide sequence encoding the predicted extracellular domain of the HLF 
polypeptide having the amino acid sequence in SEQ ID NO:2 (i.e., positions 1 to 101 of SEQ 
ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 209123; 

(c) a nucleotide sequence encoding the predicted transmembrane domain of the HLF 
polypeptide having the amino acid sequence in SEQ ID NO:2 (i.e., positions 102 to 121 of 
SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 209123; 

(d) a nucleotide sequence encoding the predicted intracellular domain of the HLF 
polypeptide having the amino acid sequence in SEQ ID NO:2 (i.e., positions 122 to 157 of 
SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 209123; 

(e) a nucleotide sequence encoding a soluble HLF polypeptide having the extracellular 
and intracellular domains but lacking the transmembrane domain; and 

(f) a nucleotide sequence complementary to any of the nucleotide sequences in (a) 
through (e) above. 

2. The nucleic acid molecule of claim 1 wherein said polynucleotide has the 
complete nucleotide sequence in Figures 1A and 1B (SEQ ID NO:1). 

3. The nucleic acid molecule of claim 1 wherein said polynucleotide has the 
nucleotide sequence in Figures 1A and 1B (SEQ ID NO:1) encoding the HLF polypeptide 
having the amino acid sequence in positions 1 to 157 of SEQ ID NO:2. 

4. The nucleic acid molecule of claim 1 wherein said polynucleotide has the 
nucleotide sequence in Figures 1A and 1B (SEQ ID NO:1) encoding the extracellular domain of 
the HLF polypeptide having the amino acid sequence from about 1 to about 101 in SEQ ID 
NO:2. 

5. An isolated nucleic acid molecule comprising a polynucleotide having a 
nucleotide sequence at least 95% identical to a sequence selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising the amino acid 
sequence of residues n-157 of SEQ ID NO: 2, where n represents an integer from 1 to 35; 

(b) a nucleotide sequence encoding a polypeptide comprising the amino acid 
sequence of residues 1-m of SEQ ID NO:2, wherein m represents an integer from 73 to 101 ; 

(c) a nucleotide sequence encoding a polypeptide having the amino acid sequence 
consisting of residues n-m of SEQ ID NO:2, where n and m are integers as defined respectively 
in (a) and (b) above; and 

(d) a nucleotide sequence encoding a polypeptide consisting of a portion of the 
complete HLF amino acid sequence encoded by the cDNA clone contained in ATCC Deposit 
No. 209123 wherein said portion excludes from 1 to about 35 amino acids from the amino 
terminus of said complete amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 209123; 

(e) a nucleotide sequence encoding a polypeptide consisting of a portion of the complete 
HLF amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 
209123 wherein said portion excludes from 1 to about 83 amino acids from the carboxy 
terminus of said complete amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 209123; and 

(f) a nucleotide sequence encoding a polypeptide consisting of a portion of the complete 
HLF amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 
209123 wherein said portion includes a combination of any of the amino terminal and carboxy 
terminal deletions in (d) and (e), above. 

6. The nucleic acid molecule of claim 1 wherein said polynucleotide has the 
complete nucleotide sequence of the cDNA clone contained in ATCC Deposit No. 209123. 

7. The nucleic acid molecule of claim 1 wherein said polynucleotide has the 
nucleotide sequence encoding the HLF polypeptide having the complete amino acid sequence 
encoded by the cDNA clone contained in ATCC Deposit No. 209123. 

8. The nucleic acid molecule of claim 1 wherein said polynucleotide has the 
nucleotide sequence encoding the extracellular domain of the HLF polypeptide having the 
amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 209123. 

9. An isolated nucleic acid molecule comprising a polynucleotide which hybridizes 
under stringent hybridization conditions to a polynucleotide having a nucleotide sequence 
identical to a nucleotide sequence in (a), (b), (c), (d), or (e) of claim 1 wherein said 
polynucleotide which hybridizes does not hybridize under stringent hybridization conditions to 
a polynucleotide having a nucleotide sequence consisting of only A residues or of only T 
residues. 

10. An isolated nucleic acid molecule comprising a polynucleotide which encodes 
the amino acid sequence of an epitope-bearing portion of a HLF polypeptide having an amino 
acid sequence in (a), (b), (c), (d), or (e) of claim 1. 

11. The isolated nucleic acid molecule of claim 10, which encodes an 
epitope-bearing portion of a HLF polypeptide wherein the amino acid sequence of said portion 
is selected from the group of sequences in SEQ ID NO:2 consisting of: about Ser-1 to about 
Thr-8, about Thr-9 to about Lys-18, about Thr-23 to about His-31, about Phe-32 to about 
Leu-40, about Cys-43 to about Val-51, about Thr-56 to about Tyr-68, about Gln-75 to about 
Leu-84, about Tyr-126 to about Ala-135, about Ser-137 to about Leu-146, and about Ser-148 
to about Lys-157. 

12. A method for making a recombinant vector comprising inserting an isolated 
nucleic acid molecule of claim 1 into a vector. 

13. A recombinant vector produced by the method of claim 12. 

14. A method of making a recombinant host cell comprising introducing the 
recombinant vector of claim 13 into a host cell. 

15. A recombinant host cell produced by the method of claim 14. 

16. A recombinant method for producing a HLF polypeptide, comprising culturing 
the recombinant host cell of claim 15 under conditions such that said polypeptide is expressed 
and recovering said polypeptide. 

17. An isolated HLF polypeptide comprising an amino acid sequence at least 95% 
identical to a sequence selected from the group consisting of: 

(a) the amino acid sequence of the HLF polypeptide having the complete amino acid 
sequence shown in SEQ ID NO:2 (i.e., positions 1 to 157 of SEQ ID NO:2) or the complete 
amino acid sequence encoded by the cDNA clone contained in the ATCC Deposit No. 209123; 

(b) the amino acid sequence of the predicted extracellular domain of the HLF 
polypeptide having the amino acid sequence shown in SEQ ID NO:2 (i.e., positions 1 to 101 of 
SEQ ID NO:2) or as encoded by the cDNA clone contained in the ATCC Deposit No. 209123; 

(c) the amino acid sequence of the predicted transmembrane domain of the HLF 
polypeptide having the amino acid sequence shown in SEQ ID NO:2 (i.e., positions 102 to 121 
of SEQ ID NO:2) or as encoded by the cDNA clone contained in the ATCC Deposit No. 
209123; 

(d) the amino acid sequence of the predicted intracellular domain of the HLF 
polypeptide having the amino acid sequence shown in SEQ ID NO:2 (i.e., positions 122 to 157 
of SEQ ID NO:2) or as encoded by the cDNA clone contained in the ATCC Deposit No. 
209123; and 

(e) the amino acid sequence of a soluble HLF polypeptide having the extracellular and 
intracellular domains but lacking the transmembrane domain. 

18. An isolated polypeptide comprising an epitope-bearing portion of the HLF 
protein, wherein said portion is selected from the group consisting of: a polypeptide comprising 
amino acid residues from about Ser-1 to about Thr-8, about Thr-9 to about Lys-18, about 
Thr-23 to about His-31, about Phe-32 to about Leu-40, about Cys-43 to about Val-51, about 
Thr-56 to about Tyr-68, about Gln-75 to about Leu-84, about Tyr-126 to about Ala-135, about 
Ser-137 to about Leu-146, and about Ser-148 to about Lys-157 of SEQ ID NO:2. 

19. An isolated antibody that binds specifically to a HLF polypeptide of claim 17. 

20. An isolated nucleic acid molecule comprising a polynucleotide having a 
sequence at least 95% identical to a sequence selected from the group consisting of: 

(a) the nucleotide sequence of clone HAGFE38R (SEQ ID NO:4); 

(b) the nucleotide sequence of a portion of the sequence shown in Figures 1A and 
1B (SEQ ID NO:1) wherein said portion comprises at least 50 contiguous nucleotides from 
nucleotide about 1 to about 220 and from about 400 to 2199; 

(c) the nucleotide sequence of a portion of the sequence shown in Figures 1A and 
1B (SEQ ID NO:1) wherein said portion consists of residues 1 to 2199, 1 to 1500, 1 to 1000, 
1 to 500, 1 to 250, 250 to 2199, 250 to 1500, 250 to 1000, 250 to 500, 500 to 2199, 500 to 
1500, 500 to 1000, 1000 to 2199, and 1000 to 1500; and 

(d) a nucleotide sequence complementary to any of the nucleotide sequences in: (a), 
(b), or (c) above. 

21. A method for diagnosing cancer in a human comprising: 

(a) assaying HLF gene expression level in cells or body fluid of an individual; and 

(b) comparing the HLF gene expression level with a standard HLF gene expression 
level, whereby an increase or decrease in the assayed HLF gene expression level compared to 
the standard expression level is indicative of cancer in the tissue type assayed. 
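
By way of illustration only, the domain boundaries recited in claims 1 and 17 can be checked against SEQ ID NO:2 with a short script. The sketch below assumes the one-letter transcription of SEQ ID NO:2 that appears in Figure 1A; the names `HLF`, `DOMAINS`, `residues` and `soluble_form` are hypothetical helpers introduced here and form no part of the disclosure.

```python
# One-letter transcription of SEQ ID NO:2 (157 residues), as read from Figure 1A.
HLF = (
    "SSSSSATTTTPETSTSPKFH"  # residues 1-20
    "TTTYSTERSEHFKPCRDKDL"  # residues 21-40
    "AYCLNDGECFVIETLTGSHK"  # residues 41-60
    "HCRCKEGYQGVRCDQFLPKT"  # residues 61-80
    "DSILSDPNHLGIEFMESEEV"  # residues 81-100
    "YQRQVLSISCIIFGIVIVGM"  # residues 101-120
    "FCAAFYFKSKRNITANSVSE"  # residues 121-140
    "ERWKGLPSQEPNLQQDK"     # residues 141-157
)

# Predicted domain boundaries as recited in claims 1 and 17 (1-based, inclusive).
DOMAINS = {
    "extracellular": (1, 101),
    "transmembrane": (102, 121),
    "intracellular": (122, 157),
}


def residues(seq, start, end):
    """Return residues start..end using the 1-based inclusive numbering of the claims."""
    return seq[start - 1:end]


def soluble_form(seq):
    """Join the extracellular and intracellular domains, omitting the transmembrane span."""
    ecd = residues(seq, *DOMAINS["extracellular"])
    icd = residues(seq, *DOMAINS["intracellular"])
    return ecd + icd


if __name__ == "__main__":
    for name, (start, end) in DOMAINS.items():
        print(f"{name:14s} {start:3d}-{end:3d} {residues(HLF, start, end)}")
```

The same `residues` helper can be applied to the epitope-bearing ranges of claims 11 and 18, for example `residues(HLF, 1, 8)` for the span from Ser-1 to Thr-8.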
1 CTCTTCTTCCTCCTCCGCTACCACCACCACACCAGAAACTAGCACCAGCCCCAAATTTCA 60 

1 SSSSSATTTTPETSTSPKFH 20 

61 TACGACGACATATTCCACAGAGCGATCCGAGCACTTCAAACCCTGCCGAGACAAGGACCT 120 

21 TTTYSTERSEHFKPCRDKDL 40 

121 TGCATACTGTCTCAATGATGGCGAGTGCTTTGTGATCGAAACCCTGACCGGATCCCATAA 180 

41 AYCLNDGECFVIETLTGSHK 60 

181 ACACTGTCGGTGCAAAGAAGGCTACCAAGGAGTCCGTTGTGATCAATTTCTGCCGAAAAC 240 

61 HCRCKEGYQGVRCDQFLPKT 80 

241 TGATTCCATCTTATCGGATCCAAACCACTTGGGGATTGAATTCATGGAGAGTGAAGAAGT 300 

81 DSILSDPNHLGIEFMESEEV 100 

301 TTATCAAAGGCAGGTGCTGTCAATTTCATGTATCATCTTTGGAATTGTCATCGTGGGCAT 360 

101 YQRQVLSISCIIFGIVIVGM 120 

361 GTTCTGTGCAGCATTCTACTTCAAAAGCAAAAGGAATATTACAGCAAATTCTGTGTCTGA 420 

121 FCAAFYFKSKRNITANSVSE 140 

421 GGAAAGATGGAAGGGTCTGCCTTCCCAGGAGCCCAATCTGCAACAAGACAAATAATGCCT 480 

141 ERWKGLPSQEPNLQQDK* 157 

481 AACAATGGATTAATGATGTCTACTATTCTGCAACTTACATCTCATTTCTTTCTAATGCAT 540 

541 TGGACCAGAGAAATTTAAAACTCAAATGAACTGTAAAGTTTCCACACTGACACTGTTGGG 600 

601 CTAATAGTATTCCCATGTGCAAGGCATGCATCTTTTCTTCCCCAGAGCAATGCCTCTCAT 660 

661 GAGAGAGCTAATGGTATTGCAATCAGCTGCTGATTGTTTTCTCTGTTCCCATTTTCTGGG 720 

721 TGAAGGAAGAAAGAGCAAAAAAGTGTGTGCTTGTGAGAGAGGAGGGATGGTAGATAGGCA 780 

781 GAGGCAGGCTCAGAATGGAAGGACCACGTATCTTGGAATATTACTAAGTCAGGACTTGAG 840 

841 TGAAAAAAGACTAAAGGTAAGCAAATTATAAAAGGATTTAGGAAACGCAGTCCGGTATTG 900 

901 GATATTGCTTAAAGAAAATTCCCTTATAAGTTTATACTTCCAAGACTCTGAATTGGATTA 960 

961 CTGCAAACATCATTAAGTGTTTCTAATTTAATCCCATGAGAGTAATGGAATCCTTGCTCT 1020 

FIG. 1A 
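
The relationship between the nucleotide lines and the amino acid lines of Figure 1A can be reproduced with a small translation routine. What follows is a minimal sketch, assuming the standard genetic code and a reading frame that begins at nucleotide 2 of SEQ ID NO:1 (an inference from the figure, in which the first serine codon is TCT at nucleotides 2 to 4). The function name `translate` and the constant names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order for each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate dna from the 0-based offset; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First line of Figure 1A (nucleotides 1 to 60 of SEQ ID NO:1).
FIG1A_FIRST_LINE = "CTCTTCTTCCTCCTCCGCTACCACCACCACACCAGAAACTAGCACCAGCCCCAAATTTCA"

if __name__ == "__main__":
    # offset=1 skips nucleotide 1 so that translation starts at nucleotide 2.
    print(translate(FIG1A_FIRST_LINE, offset=1))
```

Only complete codons are translated, so the two trailing nucleotides of the first line (the start of the histidine codon completed on the next line) are ignored.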
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1021 


GAGACATGCACTCTTACTTTTTCAGGATGATTTACCAGACTAGAACCTCCTGATTTCCCC 


1080 


1081 


TTTTTTGTGTGTGTGAATGAACCCCTGATAAMTCTTGTGGCTGTAACAtGCTCCTTAAA 


1140 


1141 


ATGCTGAtAtGATAGATTTATTTTTAACAATAGGCTATAGATTAGCTGTtAGGAAGCAAA 


1200 


1201 


TAGATTATTACAACAGGATtAAAGCAACTAAGAGTGCTAGAGATAAAAGtCTCCCAAATA 


1260 


1261 


ATTGGAMGATAAMGAMtATCTTAAAAAACAGAGCTACATCACACTGATATTGTAAAt 


1320 


1321 


TCAAAATGGGTAATGAAGCtCAAAGCCTCCAAAGCTTGCAGCAAGTGCTGGTGAATTGCt 


1380 


1381 


TGGGAAGATGCAACTAGTGTAATCTTTTACCTTTGGGTCAATGTTCTGAtTCTTTTGCAG 


1440 


1441 


CTTCTGCTCACAAGACTGAGCTTGCTTGATGGTATCGGGAAAGATATGAACATTTTGCGT 


1500 


1501 


GTGCCTCCACATGCAGCCACCACAGTGTCCGTGGAAGATAGCTTTTATGAACTTCATTTA 


1560 


1561 


CAGAGGAGGAAATGGAGGCTCAACAAGTTTAGGAMTTAtTAGGGTAGCAAAACTAGTGG 


1620 


1621 


GTAGCAGAGtGGGATTCAAATCCCAGTCCCTGTGATACAATAAGCCACGCTCTGTAGGGt 


1680 


1681 


GCTACTGACtGGAGAAGCTCATTGCTMGACCGGCCATGTGCTCCACTGACGGCACTATC 


1740 


1741 


TTTGTCAGAGACGTTGGAAGACAGGCAAAATTCAAGGGCATGATTCTACtGGGAAAGTTG 


1800 


1801 


TCAGAATCAAAATGGAGTCA 1 i 1 GTGTTAAAAACCCTGACAAATAGAGCCGGAGAAGGAC 


1860 


1861 


ATGAAGGGAGCAGTCACGTAGGCAAATGCCTGATTACAAGAACTATCACAAAAGTCTGTG 


1920 


1921 


AAAACCGCAGCTTTGCATGAAGACTATTGCAGCCTTACACGCACGAAAAtAGTTCTGCAA 


1980 


1981 


GGACATATGCCCAGCAACTtCCTGTCCACCCTTGGACTGGCTCCTCCTTTCTTGGGATCC 


2040 


2041 


TTGCAGCCAAGGATAGTGACCTCAAATCAGTTGTGTACCTAACGTTTCCTGTCTTCCTAG 


2100 c ;*<a 


2101 


TGATAAAACATAGTTTCCTATATCGTGTGtATTCCCATTGCAACACTTAtTTCCAAATAA 


2160 


2161 


ATATTTTCTtTTAGAGTCTCAAAAAAAAAAAAAAAAAAA 2199 





FIG.1B 
2 SSSSATTTTPETSTSPKFHTTTYSTERSEHFKPCRDKDLAYCLNDGECFV 51 
149 SSESPIRISVSTEGANTSSSTSTSTTGTSHLVKCAEKEKTFCVNGGECFM 198 

52 IETLTGSHKH.CRCKEGYQGVRCDQFLPKTDSILSDPNHLGIEFMESEEV 100 
199 VKDLSNPSRYLCKCPNEFTGDRCQNYV.....MASFYKHLGIEFMEAEEL 243 

101 YQRQVLSISCIIFGIVIVGMFCAAFYFKSKRNITANSVSEERWKGLPSQE 150 
244 YQKRVLTITGICIALLVVGIMCVVAYCKTKKQ..RKKLHDRLRQSLRSER 291 

151 PNLQQ 155 
292 NNMMN 296 

FIG. 2 
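
The degree of similarity displayed in Figure 2 can be summarized as a percent identity over aligned columns. The sketch below shows one simple way to compute such a value, assuming two pre-aligned strings of equal length in which '.' or '-' marks a gap. It is not the alignment program used to generate Figure 2, and the helper name `percent_identity` is hypothetical. Columns containing a gap are left out of the denominator, which is only one of several conventions in use.

```python
GAP_CHARS = {".", "-"}


def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Final twelve aligned columns of the second block of Figure 2:
# HLF residues 89-100 against heregulin residues 232-243.
HLF_89_100 = "HLGIEFMESEEV"
HRG_232_243 = "HLGIEFMEAEEL"

if __name__ == "__main__":
    print(round(percent_identity(HLF_89_100, HRG_232_243), 1))
```

A percent identity computed this way depends on the alignment supplied, so it should not be read as the measure of identity intended by the claims unless the same alignment method and parameters are used.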
6 14 20 31 42 

αTGF SHFNDCPDSHTQFCFHG-TCRFLVQEDKP---ACVCHSGYVGARCEHADLLA 

EGF RNSDSECPLSHDGYCLHDGVCMYIEALDKY---ACNCVVGYIGERCQYRDLKW 

HB-EGF GKKRDPCLRKYKDFCIHG-ECKYVKELRAP---SCICHPGYGGERCHGLSLP 

Amph RKKKNPCNAEFQNFCIHG-ECKYIEHLEAV---TCKCQQEYFGERCGEKSMKT 

βcell KGHFSRCPKQYKHYCIKG-RCRFVVAEQTP---SCVCDEGYIGARCERVDLFY 

neuR TSHLIKCAEKEKTFCVNGGECFTVKDLSNPSRYLCKCQPGFTGARCTENVPMK 

M-gal TSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCQPGFTGARCTENVPMK 

ttrqpi TSHLVKCAEKEKTFCVNGGECFMVKDLSNPSRYLCKCPNEFTGDRCQNYVMAS 

i IRG - 2 SGHARKCNETAKSYCVNGGVCYYIEGINQLS---CKCPVGYTGDRCQQFAMVN 

HLF SEHFKPCRDKDLAYCLNDGECFVIETLTGSHK-HCRCKEGYQGVRCDQFLPKT 

FIG. 5 
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 



A. The indications made below relate to the microorganism referred to in the description 



on page 



line 33 



B. IDENTIFICATION OF DEPOSIT 



Further deposits are identified on an additional sheet 



Name of depositary institution 



American Type Culture Collection 



Address of depositary institution {including postal code and country) 

12301 Parklawn Drive 
Rockville, Maryland 20852 
United States of America 



Date of deposit June 19, 1997 


Accession Number 209123 


C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet 



DNA Plasmid HAGFE38X 

In respect of those designations in which a European Patent is sought a sample of the deposited microorganism will be 
available until the publication of the mention of the grant of the European patent or until the date on which the application 
has been refused or withdrawn or is deemed to be withdrawn, only by the issue of such a sample to an expert nominated 
by the person requesting the sample (Rule 28(4)EPC). 



D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 



E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 



The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession 
Number of Deposit") 

For receiving Office use only 

For International Bureau use only 

This sheet was received by the International Bureau on: 

Authorized officer 



CANADA 

The applicant requests that, until either a Canadian patent has been issued on the basis of an 
application or the application has been refused, or is abandoned and no longer subject to 
reinstatement, or is withdrawn, the Commissioner of Patents only authorizes the furnishing of 
a sample of the deposited biological material referred to in the application to an independent 
expert nominated by the Commissioner. The applicant must, by a written statement, inform the 
International Bureau accordingly before completion of technical preparations for publication 
of the international application. 

NORWAY 

The applicant hereby requests that, until the application has been laid open to public inspection (by 
the Norwegian Patent Office), or has been finally decided upon by the Norwegian Patent 
Office without having been laid open to public inspection, the furnishing of a sample shall only be 
effected to an expert in the art. The request to this effect shall be filed by the applicant with 
the Norwegian Patent Office not later than at the time when the application is made available 
to the public under Sections 22 and 33(3) of the Norwegian Patents Act. If such a request has 
been filed by the applicant, any request made by a third party for the furnishing of a sample 
shall indicate the expert to be used. That expert may be any person entered on the list of 
recognized experts drawn up by the Norwegian Patent Office or any person approved by the 
applicant in the individual case. 

AUSTRALIA 

The applicant hereby gives notice that the furnishing of a sample of a microorganism shall 
only be effected prior to the grant of a patent, or prior to the lapsing, refusal or withdrawal of 
the application, to a person who is a skilled addressee without an interest in the invention 
(Regulation 3.25(3) of the Australian Patents Regulations). 

FINLAND 

The applicant hereby requests that, until the application has been laid open to public 
inspection (by the National Board of Patents and Registration), or has been finally decided 
upon by the National Board of Patents and Registration without having been laid open to 
public inspection, the furnishing of a sample shall only be effected to an expert in the art. 

UNITED KINGDOM 

The applicant hereby requests that the furnishing of a sample of a microorganism shall only 
be made available to an expert. The request to this effect must be filed by the applicant with 
the International Bureau before the completion of the technical preparations for the 
international publication of the application. 
of recognized experts drawn up by the Patent Office or any person approved by the applicant in 
the individual case. 



SWEDEN 



The applicant hereby requests that, until the application has been laid open to public 
inspection (by the Swedish Patent Office), or has been finally decided upon by the Swedish 
Patent Office without having been laid open to public inspection, the furnishing of a sample 
shall only be effected to an expert in the art. 



NETHERLANDS 



[...] the microorganism shall be made available as [...] only by the issue of a sample to 
an expert. The request to this effect must be furnished by the applicant with the Netherlands 
Industrial Property Office before the date on which the application is made available to the 
public under Section 22C or Section 25 of the Patents Act of the Kingdom of the Netherlands, 
whichever of the two dates occurs earlier. 